Re: [sqlite] Compression for fts5
Hello!

Yes, you are right: compression has to be declared for each field you want compressed. I did it this way because I have some fields whose typical size does not justify the overhead of compression.

Cheers!

On 25/09/2018 14:29, Wout Mertens wrote:
> This is really cool, thanks for sharing!
>
> I wonder though, is the compression done per field? I read the source but I couldn't figure it out quickly (not really used to the sqlite codebase).
>
> What are the compression ratios you achieve?
>
> Wout.
>
> On Mon, Sep 24, 2018 at 3:58 PM Domingo Alvarez Duarte wrote:
>> Hello!
>>
>> After looking at how compression is implemented in fts3 and wanting the same for fts5, I managed to get a working implementation, which I'm sharing here under the same license as sqlite3 in the hope that it can be useful to others and maybe be added to sqlite3.
>>
>> Cheers!
>>
>> Here is an implementation of optional compression and min_word_size for columns in fts5:
>>
>> ===
>> create virtual table if not exists docs_fts using fts5(
>>     doc_fname unindexed, doc_data compressed,
>>     compress=compress, uncompress=uncompress,
>>     tokenize = 'unicode61 min_word_size=3'
>> );
>> ===
>>
>> https://gist.github.com/mingodad/7fdec8eebdde70ee388db60855760c72
>>
>> And here is an implementation of optional compression for columns in fts3/4:
>>
>> ===
>> create virtual table if not exists docs_fts using fts4(
>>     doc_fname, doc_data,
>>     tokenize = 'unicode61',
>>     notindexed=doc_fname, notcompressed=doc_fname,
>>     compress=compress, uncompress=uncompress
>> );
>> ===
>>
>> https://gist.github.com/mingodad/2f05cd1280d58f93f89133b2a2011a4d

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
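The point about small fields not justifying the overhead can be illustrated with plain zlib. The following is only a sketch and is not taken from the posted gists: the `MIN_COMPRESS_BYTES` cutoff, the one-byte tag, and the helper names are all hypothetical choices made for the example.

```python
import zlib

# Hypothetical cutoff: values shorter than this are stored uncompressed.
MIN_COMPRESS_BYTES = 64

def maybe_compress(text: str) -> bytes:
    """Return a 1-byte tag followed by raw or zlib-compressed UTF-8 data."""
    raw = text.encode("utf-8")
    if len(raw) >= MIN_COMPRESS_BYTES:
        packed = zlib.compress(raw)
        if len(packed) < len(raw):  # keep the compressed form only if it pays off
            return b"Z" + packed
    return b"R" + raw

def restore(stored: bytes) -> str:
    """Undo maybe_compress() by looking at the tag byte."""
    tag, payload = stored[:1], stored[1:]
    if tag == b"Z":
        payload = zlib.decompress(payload)
    return payload.decode("utf-8")

short_name = "report.txt"
long_body = "full text search over compressed columns " * 50

# zlib adds a header and a checksum, so tiny inputs grow when compressed.
print(len(short_name), len(zlib.compress(short_name.encode("utf-8"))))
print(len(long_body), len(maybe_compress(long_body)))
```

A file name of a few bytes comes out larger after zlib than before, while a long, repetitive body shrinks considerably, which is one reason to choose compression column by column.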
Re: [sqlite] Compression for fts5
This is really cool, thanks for sharing!

I wonder though, is the compression done per field? I read the source but I couldn't figure it out quickly (not really used to the sqlite codebase).

What are the compression ratios you achieve?

Wout.

On Mon, Sep 24, 2018 at 3:58 PM Domingo Alvarez Duarte wrote:
> Hello!
>
> After looking at how compression is implemented in fts3 and wanting the same for fts5, I managed to get a working implementation, which I'm sharing here under the same license as sqlite3 in the hope that it can be useful to others and maybe be added to sqlite3.
>
> Cheers!
>
> Here is an implementation of optional compression and min_word_size for columns in fts5:
>
> ===
> create virtual table if not exists docs_fts using fts5(
>     doc_fname unindexed, doc_data compressed,
>     compress=compress, uncompress=uncompress,
>     tokenize = 'unicode61 min_word_size=3'
> );
> ===
>
> https://gist.github.com/mingodad/7fdec8eebdde70ee388db60855760c72
>
> And here is an implementation of optional compression for columns in fts3/4:
>
> ===
> create virtual table if not exists docs_fts using fts4(
>     doc_fname, doc_data,
>     tokenize = 'unicode61',
>     notindexed=doc_fname, notcompressed=doc_fname,
>     compress=compress, uncompress=uncompress
> );
> ===
>
> https://gist.github.com/mingodad/2f05cd1280d58f93f89133b2a2011a4d
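For readers who want to try the general idea without patching SQLite: stock FTS4 already accepts `compress=` and `uncompress=` options naming SQL functions that the application registers. The sketch below uses only those stock options from Python. It does not use the `compressed` or `notcompressed` column options from the gists, and as far as I understand the stock behaviour, the hooks apply to every column of the table. The table contents and the `demo` helper are made up for illustration, and the function returns None if the local SQLite build lacks FTS4.

```python
import sqlite3
import zlib

def compress(value):
    # FTS4 passes each column value through this hook before storing it.
    if value is None:
        return None
    return zlib.compress(value.encode("utf-8"))

def uncompress(value):
    # FTS4 passes each stored value through this hook when reading it back.
    if value is None:
        return None
    return zlib.decompress(value).decode("utf-8")

def demo():
    conn = sqlite3.connect(":memory:")
    conn.create_function("compress", 1, compress)
    conn.create_function("uncompress", 1, uncompress)
    try:
        conn.execute(
            "create virtual table docs_fts using fts4("
            "doc_fname, doc_data, compress=compress, uncompress=uncompress)"
        )
    except sqlite3.OperationalError:
        return None  # this SQLite build does not provide FTS4

    text = "the quick brown fox jumps over the lazy dog " * 40
    conn.execute(
        "insert into docs_fts(doc_fname, doc_data) values (?, ?)",
        ("fox.txt", text),
    )
    # Shadow table row layout: (docid, first column, second column).
    stored = conn.execute("select * from docs_fts_content").fetchone()[2]
    found = conn.execute(
        "select doc_data from docs_fts where docs_fts match 'lazy'"
    ).fetchone()[0]
    return len(text), len(stored), found == text

result = demo()
print(result)
```

The full-text index is built from the original text, so MATCH queries keep working, while the copy kept in the content shadow table is the compressed blob.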
Re: [sqlite] compression
On 9/28/05, Dennis Jenkins <[EMAIL PROTECTED]> wrote:
> This is from my Gentoo 2005.1 Linux system (home) with whatever packages it installed a few days ago.
> At work I use FreeBSD and I've not used loopback devices there, but the FreeBSD Handbook (online for free) shows how to do it.

I'm using Gentoo at home too, so that won't be a problem :)

--
The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call
http://www.lulu.com/content/77264
Re: [sqlite] compression
Jay Sprenkle wrote:
> On 9/28/05, Dennis Jenkins <[EMAIL PROTECTED]> wrote:
>> Your third statement is not true. On Linux (and FreeBSD, but FreeBSD does not have Reiser as far as I know) you can treat a regular file as if it were a filesystem and mount that file system via the "loop back" device. You can mount an ISO image file as an actual CD, for instance.
>
> Cool! Thanks for letting us know :)

I should have mentioned the obvious though... the file must be an image of a valid file system. For example, the following will fail:

    dd if=/dev/zero of=file bs=4096 count=1024
    losetup /dev/loop0 file
    # Fails: the file contains only zeros, not a file system.
    mount /dev/loop0 /mnt/xxx

However, the following should work:

    dd if=/dev/zero of=file bs=4096 count=1024
    losetup /dev/loop0 file
    # Create a file system on the loop device before mounting it.
    mke2fs /dev/loop0
    mount /dev/loop0 /mnt/xxx

You can even encrypt the entire filesystem over loop back:

    dd if=/dev/zero of=blob bs=1M count=1024
    losetup -e AES256 /dev/loop0 blob
    # -j adds a journal, which gives an ext3 file system.
    mke2fs -j /dev/loop0
    mount /dev/loop0 /mnt/crypto

As usual, do a "man" on "losetup".

This is from my Gentoo 2005.1 Linux system (home) with whatever packages it installed a few days ago. At work I use FreeBSD and I've not used loopback devices there, but the FreeBSD Handbook (online for free) shows how to do it.
Re: [sqlite] compression
Christian Smith wrote:
> On Wed, 28 Sep 2005, Sid Liu wrote:
>> Is there a possibility that this Reiser 4 be used on a file, rather than a file system? Hopefully on Windows?
>
> Reiser FS is a filesystem. It manages files. So it cannot be used on a file.

Your third statement is not true. On Linux (and FreeBSD, but FreeBSD does not have Reiser as far as I know) you can treat a regular file as if it were a filesystem and mount that file system via the "loop back" device. You can mount an ISO image file as an actual CD, for instance.

Years ago I imaged all of my old DOS floppies. I access them via the loop back device now. In theory, you can do that with any file system that can use a block device (ntfs, iso9660, ext3, etc...) but not with nfs, smbfs, proc, etc...

    dd if=/dev/fd0 of=floppy_file.img
    # Eject floppy, don't need it anymore.
    losetup /dev/loop0 floppy_file.img
    mount -t vfat /dev/loop0 /mnt/floppy
    ls -l /mnt/floppy

> Windows NTFS already has compressed files. Right-click a file or directory in Explorer, select Properties, then Advanced Attributes. You can turn on compression there. Don't know how to do it from the command line, though.
Re: [sqlite] compression
On Wed, 28 Sep 2005, Sid Liu wrote:
> Is there a possibility that this Reiser 4 be used on a file, rather than a file system? Hopefully on Windows?

Reiser FS is a filesystem. It manages files. So it cannot be used on a file.

Windows NTFS already has compressed files. Right-click a file or directory in Explorer, select Properties, then Advanced Attributes. You can turn on compression there. Don't know how to do it from the command line, though.

> --- Jay Sprenkle <[EMAIL PROTECTED]> wrote:
>> If you're on Linux, read about the Reiser 4 file system. They found they could compress the entire file system on the fly and achieve higher performance as well. Most CPUs can compress and move data faster overall, because the time spent compressing is made up on the slow I/O channels to hard disks. Might be a much easier solution.
>>
>> On 9/24/05, Martin Pfeifle <[EMAIL PROTECTED]> wrote:
>>> Hello,
>>> Does anybody know whether it is possible to compress SQLite data at the page level? If I compress the SQLite database file with zlib I get very high compression rates due to the character of the stored data.
>>> I think this problem is related to the problem of using encrypted databases. Perhaps it is possible to simply replace the encryption function call with a zlib compression call.
>>> Simply integrating such a call into the read and write functions in the file os_win.c does not work.
>>> Can anybody help me, or give me a hint?
>>> Ciao Martin

--
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \
Re: [sqlite] compression
Is there a possibility that this Reiser 4 be used on a file, rather than a file system? Hopefully on Windows?

--- Jay Sprenkle <[EMAIL PROTECTED]> wrote:
> If you're on Linux, read about the Reiser 4 file system. They found they could compress the entire file system on the fly and achieve higher performance as well. Most CPUs can compress and move data faster overall, because the time spent compressing is made up on the slow I/O channels to hard disks. Might be a much easier solution.
>
> On 9/24/05, Martin Pfeifle <[EMAIL PROTECTED]> wrote:
>> Hello,
>> Does anybody know whether it is possible to compress SQLite data at the page level? If I compress the SQLite database file with zlib I get very high compression rates due to the character of the stored data.
>> I think this problem is related to the problem of using encrypted databases. Perhaps it is possible to simply replace the encryption function call with a zlib compression call.
>> Simply integrating such a call into the read and write functions in the file os_win.c does not work.
>> Can anybody help me, or give me a hint?
>> Ciao Martin
Re: [sqlite] compression
If you're on Linux, read about the Reiser 4 file system. They found they could compress the entire file system on the fly and achieve higher performance as well. Most CPUs can compress and move data faster overall, because the time spent compressing is made up on the slow I/O channels to hard disks. Might be a much easier solution.

On 9/24/05, Martin Pfeifle <[EMAIL PROTECTED]> wrote:
> Hello,
> Does anybody know whether it is possible to compress SQLite data at the page level? If I compress the SQLite database file with zlib I get very high compression rates due to the character of the stored data.
> I think this problem is related to the problem of using encrypted databases. Perhaps it is possible to simply replace the encryption function call with a zlib compression call.
> Simply integrating such a call into the read and write functions in the file os_win.c does not work.
> Can anybody help me, or give me a hint?
> Ciao Martin

--
The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call
http://www.lulu.com/content/77264
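One plausible reason a drop-in zlib call in the OS read/write layer does not work, offered here as an assumption rather than something stated in the thread: SQLite addresses each page at a fixed offset derived from the page number and page size, whereas compressed pages come out with different lengths. The sketch below builds a small throwaway database and compresses it page by page to show that variation. The table, the sample rows, and the `compressed_page_sizes` helper are invented for the example.

```python
import os
import sqlite3
import tempfile
import zlib

def compressed_page_sizes(db_path):
    """Compress each database page separately and return the resulting sizes."""
    conn = sqlite3.connect(db_path)
    page_size = conn.execute("pragma page_size").fetchone()[0]
    conn.close()
    with open(db_path, "rb") as f:
        data = f.read()
    pages = [data[i:i + page_size] for i in range(0, len(data), page_size)]
    return page_size, [len(zlib.compress(page)) for page in pages]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(path)
    conn.execute("create table notes(body text)")
    rows = [("row %d: " % i + "very repetitive text " * 20,) for i in range(500)]
    conn.executemany("insert into notes values (?)", rows)
    conn.commit()
    conn.close()

    file_size = os.path.getsize(path)
    page_size, sizes = compressed_page_sizes(path)

print("pages:", len(sizes), "page size:", page_size)
print("smallest and largest compressed page:", min(sizes), max(sizes))
print("file size:", file_size, "sum of compressed pages:", sum(sizes))
```

Because every page shrinks by a different amount, a page-level scheme needs some mapping from page number to the location of the compressed data, which is more than swapping one function call for another.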
Re: [sqlite] Compression
[EMAIL PROTECTED] wrote:
> Hello all,
>
> First of all, allow me to wish everyone a Happy New Year and I hope it'll be a good one for all.
>
> My question is (and I've raised this topic back in September, but didn't get back to it since), does anyone have a free/commercial add-on for SQLite v3 to perform on-the-fly compression/decompression of data, preferably on a field level (compress just one of the fields, not the whole table)?
>
> Thank you,
> Dennis

I had a file size problem so I considered this. My googling didn't turn up any solutions and then, upon further thought, I decided that this probably wouldn't work for most applications.

I think the data would have to be compressed on a field basis, and in that case I think you would only get good compression on fairly long fields, and then only if you never wanted to use those fields in a query. (I have to assume that decompressing fields in order to find out whether they match a WHERE condition would be deathly slow... I assume that it would cause a lot of problems with indexing as well...)

So, under those conditions, it seems like you could move the compression/decompression out of SQLite and into your program: compress strings before you write them to the database; decompress strings after you've retrieved them.

I may have a need like this. However, the original problem didn't fit this structure (no long strings), so I never pursued this...

-Alan

--
Alan Mead - [EMAIL PROTECTED]
People often find it easier to be a result of the past than a cause of the future.
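The approach Alan describes, compressing in the program rather than in SQLite, might look roughly like this with Python's sqlite3 and zlib modules. It is a minimal sketch: the `docs` table and the `save_doc` and `load_doc` helpers are hypothetical names chosen for the example.

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
conn.execute("create table docs(id integer primary key, title text, body blob)")

def save_doc(title, body):
    """Compress the long field in the application before writing it."""
    packed = zlib.compress(body.encode("utf-8"))
    cur = conn.execute(
        "insert into docs(title, body) values (?, ?)", (title, packed)
    )
    return cur.lastrowid

def load_doc(doc_id):
    """Read the row and decompress the long field after retrieving it."""
    row = conn.execute(
        "select title, body from docs where id = ?", (doc_id,)
    ).fetchone()
    return row[0], zlib.decompress(row[1]).decode("utf-8")

doc_id = save_doc("manual", "chapter text " * 200)
title, body = load_doc(doc_id)

# SQLite only sees compressed bytes in body, so a WHERE clause cannot search it.
hits = conn.execute(
    "select count(*) from docs where body like '%chapter%'"
).fetchone()[0]
print(title, len(body), hits)
```

The uncompressed `title` column stays usable in queries and indexes, while `body` is opaque to SQLite, which matches the trade-off described above.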
Re: [sqlite] Compression
Well, actually that's exactly what I need - compression of large fields, not the whole database.

Dennis

// MCP, MCSD
// ASP Developer Member
// Software for animal shelters!
// www.smartpethealth.com
// www.amazingfiles.com

----- Original Message -----
From: <[EMAIL PROTECTED]>
To:
Sent: Sunday, January 02, 2005 12:42 PM
Subject: Re: [sqlite] Compression

> Compression in the DB is interesting. I think the commercial product mentioned just does field compression and that is all. In general this only works on larger, blob-like fields, as the overhead of the compressor is usually somewhat high, and let's not forget the extra overhead of compressing/decompressing.
>
> The idea I was playing with a while back (with zlib) was a global database dictionary for compression, but as memory got cheaper and larger I dropped the project. Simple token compression schemes (as in the old days of FairCom's btree package, Sybase IQ, Monet, etc.) are much nicer.
>
> It would be a neat feature.
>
> Sandy
>
>>> My question is (and I've raised this topic back in September, but didn't get back to it since), does anyone have a free/commercial add-on for SQLite v3 to perform on-the-fly compression/decompression of data, preferably on a field level (compress just one of the fields, not the whole table)?
>>
>> Related to this I would love to see reference counting of values. For example if I add the string "foobar" in 27 different places, it only gets stored once with a reference count of 27.
>>
>> There are various places that have done compression:
>>
>> http://www.sqliteplus.com/
>>
>> There is also mention of compression at
>>
>> http://www.hwaci.com/sw/sqlite/prosupport.html
>>
>> If you are working on a commercial product and SQLite has made your product better and/or improved your development process then it is fair and worthwhile to pay for that.
>>
>> Roger
Re: [sqlite] Compression
Compression in the DB is interesting. I think the commercial product mentioned just does field compression and that is all. In general this only works on larger, blob-like fields, as the overhead of the compressor is usually somewhat high, and let's not forget the extra overhead of compressing/decompressing.

The idea I was playing with a while back (with zlib) was a global database dictionary for compression, but as memory got cheaper and larger I dropped the project. Simple token compression schemes (as in the old days of FairCom's btree package, Sybase IQ, Monet, etc.) are much nicer.

It would be a neat feature.

Sandy

>> My question is (and I've raised this topic back in September, but didn't get back to it since), does anyone have a free/commercial add-on for SQLite v3 to perform on-the-fly compression/decompression of data, preferably on a field level (compress just one of the fields, not the whole table)?
>
> Related to this I would love to see reference counting of values. For example if I add the string "foobar" in 27 different places, it only gets stored once with a reference count of 27.
>
> There are various places that have done compression:
>
> http://www.sqliteplus.com/
>
> There is also mention of compression at
>
> http://www.hwaci.com/sw/sqlite/prosupport.html
>
> If you are working on a commercial product and SQLite has made your product better and/or improved your development process then it is fair and worthwhile to pay for that.
>
> Roger
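The "global dictionary" idea can be approximated with zlib's preset-dictionary support, which Python exposes through the `zdict` argument of `zlib.compressobj` and `zlib.decompressobj`. This is a sketch of that general technique, not a reconstruction of Sandy's abandoned project. The dictionary contents and the `pack` and `unpack` helpers are invented for the example, and the same dictionary must be available whenever a value is read back.

```python
import zlib

# Hypothetical shared dictionary: byte strings that occur in many field values.
SHARED_DICT = (
    b"status=active;status=inactive;country=Germany;country=France;"
    b"plan=premium;plan=basic;"
)

def pack(value: bytes, zdict: bytes = b"") -> bytes:
    """Compress one field value, optionally priming zlib with a dictionary."""
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(value) + comp.flush()

def unpack(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress a value; zdict must match the one used by pack()."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

field = b"status=active;country=Germany;plan=premium"
plain = pack(field)
primed = pack(field, SHARED_DICT)

print(len(field), len(plain), len(primed))
```

With a preset dictionary even short values can shrink, because the compressor can point back into the dictionary instead of having to spell out every byte, which addresses the high per-field overhead mentioned above.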
Re: [sqlite] Compression
Thank you Roger,

I'm not against paying (that's why I said "commercial"). Have you (or anyone) used SQLite++? What are your thoughts on it?

Thank you,
Dennis

// MCP, MCSD
// ASP Developer Member
// Software for animal shelters!
// www.smartpethealth.com
// www.amazingfiles.com

----- Original Message -----
From: "Roger Binns" <[EMAIL PROTECTED]>
To:
Sent: Sunday, January 02, 2005 11:18 AM
Subject: Re: [sqlite] Compression

>> My question is (and I've raised this topic back in September, but didn't get back to it since), does anyone have a free/commercial add-on for SQLite v3 to perform on-the-fly compression/decompression of data, preferably on a field level (compress just one of the fields, not the whole table)?
>
> Related to this I would love to see reference counting of values. For example if I add the string "foobar" in 27 different places, it only gets stored once with a reference count of 27.
>
> There are various places that have done compression:
>
> http://www.sqliteplus.com/
>
> There is also mention of compression at
>
> http://www.hwaci.com/sw/sqlite/prosupport.html
>
> If you are working on a commercial product and SQLite has made your product better and/or improved your development process then it is fair and worthwhile to pay for that.
>
> Roger
Re: [sqlite] Compression
> My question is (and I've raised this topic back in September, but didn't get back to it since), does anyone have a free/commercial add-on for SQLite v3 to perform on-the-fly compression/decompression of data, preferably on a field level (compress just one of the fields, not the whole table)?

Related to this I would love to see reference counting of values. For example if I add the string "foobar" in 27 different places, it only gets stored once with a reference count of 27.

There are various places that have done compression:

http://www.sqliteplus.com/

There is also mention of compression at

http://www.hwaci.com/sw/sqlite/prosupport.html

If you are working on a commercial product and SQLite has made your product better and/or improved your development process then it is fair and worthwhile to pay for that.

Roger
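SQLite does not do this kind of value sharing itself, but the reference-counting idea can be emulated at the schema level. The following is a rough sketch under that assumption: the `strings` and `items` tables and the `add_item` helper are hypothetical, and a complete version would also decrement the count when a row is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table strings(
        id integer primary key,
        value text unique,
        refcount integer not null
    );
    create table items(
        id integer primary key,
        string_id integer references strings(id)
    );
""")

def add_item(value):
    """Store value once in strings and point a new items row at it."""
    conn.execute(
        "insert or ignore into strings(value, refcount) values (?, 0)", (value,)
    )
    conn.execute(
        "update strings set refcount = refcount + 1 where value = ?", (value,)
    )
    string_id = conn.execute(
        "select id from strings where value = ?", (value,)
    ).fetchone()[0]
    cur = conn.execute("insert into items(string_id) values (?)", (string_id,))
    return cur.lastrowid

for _ in range(27):
    add_item("foobar")

stored_copies, refcount = conn.execute(
    "select count(*), max(refcount) from strings"
).fetchone()
item_count = conn.execute("select count(*) from items").fetchone()[0]
print(stored_copies, refcount, item_count)
```

Each of the 27 `items` rows holds only a small integer key, and the text "foobar" is stored a single time alongside its count.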