Vista Disappointments 1 - The Filing System

Contributed by Alan Lewis

Microsoft's new operating system, Windows Vista, is "imminent". Whilst the glossies are gushing over it and salivating with anticipation, I'm a little disappointed.
OK, the release date has slipped and slipped, and even now Microsoft is only talking of a business release in November 2006, with consumer versions in early 2007 (and don't ask me what the difference is!). Preview and beta versions abound, and there is a great deal of excitement about it. Although it is still in beta status, I'm disappointed with it already. You might have seen the multi-page spreads in some glossy mags and on-line reviews, oozing praise on its 'new' interface and 'new' task-bar icons. Frankly, I expected a little more.
Before I start, I will caveat that the version I have seen is beta code, and so full of debugging code that it runs like a dog. But, and it's a big but, unless there is some radical code overhaul, a lot of PCs might struggle to run it. It's going to need a lot of RAM (I'm expecting it will need 1GB just to load the OS... more for actually using any programs), a hefty CPU or two, and a decent, modern, bang-up-to-date graphics card.
I'll come back to the graphics card, as first of all I was hoping to see a major overhaul of something hidden and rather more fundamental. The filing system. Yep, the bit that you don't really interact with. The bit that lets you store programs on the hard disk, save files, etc. The bit of software glue and logic that sits, invisibly, between us and the actual hard disk. As users, we don't have any real interaction with the filing system. We install programs, open them, open and save letters and data files, music, and so forth. The limit of our interaction is the open/save dialog box.
At a very low level, the filing system handles how the ones and zeroes that make up each file are written onto the disk, what sort of checks are made to ensure it was written properly or read correctly, and that the file stays intact. At a higher level, it does exactly what it says on the can - it files data away. Much like a filing cabinet, it keeps everything together. And in the same way as a filing cabinet, it relies on you to decide the order of files and folders, the basic organisation of how everything is grouped. You can throw everything into one folder, or have as many folders as you desire. A neat and well ordered filing system, or an absolute mess. Either way, it maintains a directory of which files are on a disk, and (invisibly to us) where they are. And like a filing cabinet's documents, you can have multiple copies of files. As many as you want.
Which is why I am so annoyed that the proposed metadata-based file system (WinFS) is not going to be included when Vista ships. For those who don't know, it was/is a completely new way of representing files on disk. Under the current Windows filing systems (typically NTFS for XP, Windows 2000 and NT4; FAT or FAT32 for Windows 95, 98 and Me) the OS forces file and directory naming on you, by way of categorisation; you put everything in one or more folders. You cannot save a file without a name (no complaints there!) but you have to use folders to store everything in. Well, you could *try* to stick everything in one place, but it has limits. Quite apart from the fact that you would have a very long search each time you wanted to find anything.
So you use folders, much like the folders in a filing cabinet. One stores like files with like, perhaps categorising such as "letters", "CV", "music files", and so forth. In essence, the directory and file structure is a flat file database. Flat file, because it is all just one list. The different folders are just subheadings, if you like.
Which is great providing everything neatly dovetails like that. Some things don't. In fact, a lot of things don't. And these things are the types of file that have become prevalent since the Internet kick-started digital media.
Let's take an example of a CD collection, held as MP3. You store tracks in a folder named after the album, which is great. But how do you collect the albums? By genre? By year? What about misc albums you only have one or two examples of? You may have your album list held in a database, which you index by group, genre, year, etc. It's an easy job to list all your punk or jazz, or all albums from 1995, or all albums with Johnny Rotten. Some people have done this since the early days of home computers, and the advent of relational databases enabled cross-referencing, i.e. moving from a simple flat list of every album owned, to categorisation (and hence searching) by genre, band, year, track and so forth.
But that's a list of LPs and CDs, all of which are stored on a shelf or similar, and hence detailed very easily in a list. Consider how you store electronic music files. The logical manner is in the same way. After all, physical disks have to be stored on the shelf. But electronic files can be stored any way you want. And that's a key enabler of technology, surely? If you wanted to play every punk record from the year 1977, you would traditionally have to use a CD jukebox, or a lot of disk swapping and track programming. A computer should make this easier.
Now, there is software that can do this, granted. For example, Media Player tries to categorise everything. But this relies on the information already existing, as tags in the file or available from an on-line database. And what happens if you don't actually want to play the file, just see what you have? Oh, and Media Player can bomb out when one has a lot of media files. It crashed after hitting 45,000 or so on my system... and it is not a wimp machine.
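The cross-referenced catalogue described above (one list, searchable by genre, band, year and so on) can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the table name `tracks`, the helper `find_tracks`, the file paths and the sample rows are all made up for the example and have nothing to do with how Media Player or any Windows component actually works.

```python
import sqlite3

# Hypothetical catalogue: one row per track, each file recorded exactly once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks ("
    "path TEXT PRIMARY KEY, artist TEXT, album TEXT, genre TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?, ?)",
    [
        # Sample rows for illustration only; the paths are invented.
        ("music/0001.mp3", "The Clash", "The Clash", "Punk", 1977),
        ("music/0002.mp3", "Sex Pistols", "Never Mind the Bollocks", "Punk", 1977),
        ("music/0003.mp3", "Miles Davis", "Kind of Blue", "Jazz", 1959),
        ("music/0004.mp3", "The Clash", "London Calling", "Punk", 1979),
    ],
)

# Only these columns may be used as search attributes.
ALLOWED = ("artist", "album", "genre", "year")


def find_tracks(conn, **criteria):
    """Return the paths of tracks matching every attribute given."""
    for column in criteria:
        if column not in ALLOWED:
            raise ValueError(f"unknown attribute: {column}")
    # Build "genre = ? AND year = ?" from the keyword arguments;
    # with no criteria at all, "WHERE 1" matches every row.
    where = " AND ".join(f"{column} = ?" for column in criteria) or "1"
    sql = f"SELECT path FROM tracks WHERE {where} ORDER BY path"
    return [row[0] for row in conn.execute(sql, tuple(criteria.values()))]


# Every punk track from 1977, then everything by one band.
print(find_tracks(conn, genre="Punk", year=1977))
print(find_tracks(conn, artist="The Clash"))
```

The point of the sketch is that the same four rows answer both questions: nothing has to be copied into a "Punk" folder and again into a "1977" folder to be found either way.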
This is where the current filing system falls down. It forces the directory structure on you. You can have an album or track listed under Punk/Clash, and 1977/albums, but you need two copies of it; that is duplication and a waste of space, and it does not help with anything other than storing the file. Taking the analogy further, searching for a file (and worse, duplicates) potentially takes a *long* time. One has to wade through directories.
The proposed new file system essentially fused the disk filing system with a relational database: the database was the filing system and the filing system was a database, in which files are the records and the indexes (i.e. album name, artist, genre, year and so forth) are attributes.
"But Windows ships with an indexing service, and a Find Files function?" Yes. The indexing service can really impact system performance, and Find Files is slow. Indexing is a bolt-on to the file system, and not an integral feature. It eases [ahem] locating a file by name, or with a little luck, by type, e.g. picture or music file. To use our example, it cannot find all punk tracks from 1978, certainly not without a lot of effort.
"OK", you say, "I use Google Desktop Search/MS Search/whatever. What's the difference, I can search". True, but again these search engines are separate from the file system, and only make searching easier. And these search tools still suffer from lengthy indexing: the initial scan cataloguing the drive contents, then updating, and of course running such software imposes a performance drag and a drain on system resources. And they are not 100% accurate, often failing to find files.
Neither indexing nor desktop search addresses the underlying file/directory structure - a single-indexed hierarchical structure is still forced upon you. To have a file listed under two locations, one needs two copies of the file.
And when one looks beyond the 'home use' application, a database-centric filing system offers tremendous business opportunities.
Consider the typical workplace, with multiple copies of files held everywhere and anywhere, from multiple locations on many users' local drives to copies held on servers. And those held in email. This increases storage and backup requirements, and makes revision control a nightmare. Not to mention keeping track of which is the latest version of a given document. There are groupware and collaboration tools that can ease this burden, but even these suffer from the multiple copy problem if the software is not configured correctly. Some people use Outlook as a repository, holding documents there. Great, but it still suffers from the multiple copy issue when you send - or receive - copies of documents.
Personally, I was disappointed at WinFS being dropped. Hopefully it will ship later, perhaps in a service pack. Or maybe MS have decided it is only of use to the corporate market and will ship it in a server-oriented version of Vista. I hope not, as it offers tremendous potential, both to home and corporate users. Quite apart from anything else, Vista needs enough horsepower just to run itself, without wasting rather a lot of resources on what should be a trivial function of the filing system.