www.sixfingeredman.net
personal database brainstorm

First, we need some way to connect and relate topics. On one end, we have the filesystem, which provides very poor structural tools. On the other, we have free text. A database, or something close to one, is a good compromise between expressiveness and well-definedness, but unfortunately it plays very poorly with the filesystem. That is, it mandates a great deal of structure over the fs unless you have something like the future reiserfs.

So I conclude it's best to ignore the filesystem for structural purposes and instead provide a very simple hyperlink structure à la wiki. This gives us a set of glue files, or structure files, which are indeed probably very like wiki pages.

Now the question is how to fit filesystem elements into it. Does it contain the filesystem, or does the filesystem contain it? Are files mapped into nodes, or are nodes mapped into files? Which structure would we rather work with? Currently I'm leaning towards files mapping onto nodes. That is, we provide some way to embed files and directories in these wiki files, and then work only with them, ignoring the fact that they are themselves files in directories. Of course, this makes things a bit awkward for an existing fs structure. Maybe it'd be better to have nodes mapped onto files.

More refinements. First, it is important to consider the psychological differences between linking and searching. I said earlier that they were the same, but they're not. The linker knows much more about the contents of the database. Rule: links may contain too little information, but never too much. That is, I may link to [Blue] and get both Blue's Clues and Blue Velvet, but I'll never link to [Blue Velvet] if there's no such thing. This is a simple consequence of the fact that you forget things that exist more easily than you remember things that don't exist.
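The linking rule above can be made concrete with a small sketch. This is only one possible reading, not a settled design: it assumes a link matches every node whose name contains all of the link's words, so an underspecified link returns several candidates and a fully specified one returns exactly one. The function names `words` and `resolve` and the sample node list are made up for illustration.

```python
import re


def words(text):
    """Split text into a set of lowercase words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def resolve(link, node_names):
    """Return every node name that contains all the words of the link.

    Assumption: a link never carries words the target lacks ("never too
    much"), but it may carry fewer ("too little"), so the match is a
    subset test rather than an equality test.
    """
    wanted = words(link)
    if not wanted:
        return []
    return [name for name in node_names if wanted <= words(name)]


# Hypothetical sample database, using the two names from the text.
nodes = ["Blue's Clues", "Blue Velvet"]

ambiguous = resolve("Blue", nodes)       # too little information: two hits
exact = resolve("Blue Velvet", nodes)    # enough information: one hit
```

A link that resolves to more than one node could then be presented as a short list of choices, which is cheap in a database this small.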
I suggest also that, due to the small size of the database and my familiarity with it, only the three categories described above (primary keys, secondary keys, and types) are necessary. More advanced structures like "Artist", etc. are simply not required. I suggest furthermore that (much like in Everything) the name, or primary key, of a node, along with its type, is sufficient for unique identification. This is allowed by the variety of types, which do all the "namespacing" necessary.

Principle: structure resolves ambiguity. (When I say "Tolkien", do I want books about the author, or books by the author?) This is not necessary in small databases, which have very little overlap.

One important thing is that there should be some way of hiding objects from certain searchers. Or should there? Good question. Probably yeah. We can tag specific nodes as public and then say any node linked to by a public node must be public. This saves a lot of work but still ensures "opt-in" publication.

First thought is to have a series of organic modules which each try to apply themselves to a file in order to create a node. Some things will apply to almost every file, for example the "keyword normalization" algorithm, and the adding of secondary keywords based on path.
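As a rough sketch of the module idea, assume each module is a plain function that looks at a file path and either adds something to the node or declines. Everything here is hypothetical: the module names, the dictionary layout of a node, the particular normalization (lowercase, underscores to spaces), and the example path are all invented for illustration.

```python
from pathlib import PurePosixPath


def normalize_keyword(word):
    """Assumed normalization: trim, lowercase, underscores become spaces."""
    return word.strip().lower().replace("_", " ")


def name_module(path, node):
    """Applies to every file: the primary key comes from the file name."""
    node["name"] = normalize_keyword(path.stem)
    return True


def path_keywords_module(path, node):
    """Applies to files inside directories: each directory name on the
    path becomes a secondary keyword."""
    keys = [normalize_keyword(part) for part in path.parent.parts]
    if not keys:
        return False
    node["secondary"].extend(keys)
    return True


def image_module(path, node):
    """Applies only to files with a known image extension."""
    if path.suffix.lower() not in {".jpg", ".png", ".gif"}:
        return False
    node["type"] = "image"
    return True


# Every module gets a chance at every file; most will decline.
MODULES = [name_module, path_keywords_module, image_module]


def build_node(filename):
    """Run each module over the file and collect what they contribute."""
    path = PurePosixPath(filename)
    node = {"name": None, "type": "file", "secondary": []}
    for module in MODULES:
        module(path, node)
    return node


node = build_node("movies/drama/Blue_Velvet.jpg")
# (name, type) is the pair suggested above as the unique identifier.
identity = (node["name"], node["type"])
```

Keeping the modules independent means a new file type only needs one new function added to the list, and the general-purpose modules keep applying to it unchanged.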