Re: [sqlite] Compression for ft5

2018-09-25 Thread Domingo Alvarez Duarte

Hello !

Yes, you are right: compression has to be enabled for each field that 
you want compressed. I did it that way because I have some fields whose 
typical size does not justify the overhead of compression.
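
To make the size trade-off concrete, here is a minimal Python sketch. It assumes zlib as the codec (the thread does not say which codec is used) and registers two scalar SQL functions named compress and uncompress, the kind of functions that the compress= and uncompress= table options refer to by name. The sample strings are made up.

```python
import sqlite3
import zlib


def compress(value):
    """SQL-callable: zlib-compress a TEXT or BLOB value into a BLOB."""
    if value is None:
        return None
    raw = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    return zlib.compress(raw)


def uncompress(blob):
    """SQL-callable: inverse of compress(), returning TEXT."""
    if blob is None:
        return None
    return zlib.decompress(blob).decode("utf-8")


conn = sqlite3.connect(":memory:")
conn.create_function("compress", 1, compress)
conn.create_function("uncompress", 1, uncompress)

short_text = "a.txt"  # hypothetical short field, e.g. a file name
long_text = "the quick brown fox jumps over the lazy dog. " * 200

# Measure the stored size through SQL and check the round trip.
short_size, long_size, round_trip = conn.execute(
    "select length(compress(?)), length(compress(?)), uncompress(compress(?))",
    (short_text, long_text, long_text),
).fetchone()

print(len(short_text), "->", short_size)  # the short field gets larger
print(len(long_text), "->", long_size)    # the long field shrinks a lot
```

The short value comes out larger than it went in because of the fixed zlib header and checksum, which is the reason for leaving small columns uncompressed.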


Cheers !

On 25/09/2018 14:29, Wout Mertens wrote:

> This is really cool, thanks for sharing!
>
> I wonder though, is the compression done per field? I read the source but I
> couldn't figure it out quickly (not really used to the sqlite codebase).
> What are the compression ratios you achieve?
>
> Wout.
>
> On Mon, Sep 24, 2018 at 3:58 PM Domingo Alvarez Duarte wrote:
>
> > Hello !
> >
> > After looking at how compression is implemented in fts3 and wanting the
> > same for fts5 I managed to get a working implementation that I'm sharing
> > here with the same license as sqlite3 in hope it can be useful to others
> > and maybe be added to sqlite3.
> >
> > Cheers !
> >
> > Here is an implementation of optional compression and min_word_size for
> > columns in fts5:
> >
> > ===
> >
> > create virtual table if not exists docs_fts using fts5(
> >   doc_fname unindexed, doc_data compressed,
> >   compress=compress, uncompress=uncompress,
> >   tokenize = 'unicode61 min_word_size=3'
> > );
> >
> > ===
> >
> > https://gist.github.com/mingodad/7fdec8eebdde70ee388db60855760c72
> >
> > And here is an implementation of optional compression for columns in
> > fts3/4:
> >
> > ===
> >
> > create virtual table if not exists docs_fts using fts4(
> >   doc_fname, doc_data,
> >   tokenize = 'unicode61',
> >   notindexed=doc_fname, notcompressed=doc_fname,
> >   compress=compress, uncompress=uncompress
> > );
> >
> > ===
> >
> > https://gist.github.com/mingodad/2f05cd1280d58f93f89133b2a2011a4d

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Compression for ft5

2018-09-25 Thread Wout Mertens
This is really cool, thanks for sharing!

I wonder though, is the compression done per field? I read the source but I
couldn't figure it out quickly (not really used to the sqlite codebase).
What are the compression ratios you achieve?


Wout.


On Mon, Sep 24, 2018 at 3:58 PM Domingo Alvarez Duarte 
wrote:

> Hello !
>
> After looking at how compression is implemented in fts3 and wanting the
> same for fts5 I managed to get a working implementation that I'm sharing
> here with the same license as sqlite3 in hope it can be useful to others
> and maybe be added to sqlite3.
>
> Cheers !
>
>
> Here is an implementation of optional compression and min_word_size for
> columns in fts5:
>
> ===
>
> create virtual table if not exists docs_fts using fts5(
>  doc_fname unindexed, doc_data compressed,
>  compress=compress, uncompress=uncompress,
>  tokenize = 'unicode61 min_word_size=3'
> );
>
> ===
>
> https://gist.github.com/mingodad/7fdec8eebdde70ee388db60855760c72
>
>
> And here is an implementation of optional compression for columns in
> fts3/4:
>
> ===
>
> create virtual table if not exists docs_fts using fts4(
>  doc_fname, doc_data,
>  tokenize = 'unicode61',
>  notindexed=doc_fname, notcompressed=doc_fname,
>  compress=compress, uncompress=uncompress
> );
>
> ===
>
> https://gist.github.com/mingodad/2f05cd1280d58f93f89133b2a2011a4d
>


Re: [sqlite] compression

2005-09-28 Thread Jay Sprenkle
On 9/28/05, Dennis Jenkins <[EMAIL PROTECTED]> wrote:
> This is from my Gentoo 2005.1 Linux system (home) with whatever packages
> it installed a few days ago.
> At work I use FreeBSD and I've not used loopback devices there, but the
> FreeBSD Handbook (online for free) shows how to do it.

I'm using Gentoo at home too, so that won't be a problem :)



--
---
The Castles of Dereth Calendar: a tour of the art and architecture of
Asheron's Call
http://www.lulu.com/content/77264


Re: [sqlite] compression

2005-09-28 Thread Dennis Jenkins

Jay Sprenkle wrote:

> On 9/28/05, Dennis Jenkins <[EMAIL PROTECTED]> wrote:
>
> > Your third statement is not true. On Linux (and FreeBSD, but FreeBSD
> > does not have Reiser as far as I know) you can treat a regular file as
> > if it were a filesystem and mount that file system via the "loop back"
> > device. You can mount an ISO image file as an actual CD, for instance.
>
> Cool! Thanks for letting us know :)

I should have mentioned the obvious though... the file must be an image 
of a valid file system.


For example, the following will fail at the mount step, because the file
holds only zeros and no file system:

dd if=/dev/zero of=file bs=4096 count=1024
losetup /dev/loop0 file
mount /dev/loop0 /mnt/xxx

However, the following should work:
dd if=/dev/zero of=file bs=4096 count=1024
losetup /dev/loop0 file
mke2fs /dev/loop0
mount /dev/loop0 /mnt/xxx


You can even encrypt the entire filesystem over loop back:
dd if=/dev/zero of=blob bs=1M count=1024
losetup -e AES256 /dev/loop0 blob
mke2fs -j /dev/loop0
mount /dev/loop0 /mnt/crypto

As usual, see "man losetup" for the details.

This is from my Gentoo 2005.1 Linux system (home) with whatever packages 
it installed a few days ago.
At work I use FreeBSD and I've not used loopback devices there, but the 
FreeBSD Handbook (online for free) shows how to do it.





Re: [sqlite] compression

2005-09-28 Thread Dennis Jenkins

Christian Smith wrote:

> On Wed, 28 Sep 2005, Sid Liu wrote:
>
> > Is there a possibility that this Reiser 4 be used on a
> > file, rather than a file system? Hopefully on Windows?
>
> Reiser FS is a filesystem. It manages files. So it cannot be used on a
> file.

Your third statement is not true.  On Linux (and FreeBSD, but FreeBSD 
does not have Reiser as far as I know) you can treat a regular file as 
if it were a filesystem and mount that file system via the "loop back" 
device.  You can mount an ISO image file as an actual CD, for instance.  
Years ago I imaged all of my old DOS floppies.  I access them via the 
loop back device now.  In theory, you can do that with any file system 
that can use a block device (ntfs, iso9660, ext3, etc...) but not with 
nfs, smbfs, proc, etc...


dd if=/dev/fd0 of=floppy_file.img
# Eject floppy, don't need it anymore.

losetup /dev/loop0 floppy_file.img
mount -t vfat /dev/loop0 /mnt/floppy
ls -l /mnt/floppy



> Windows NTFS already has compressed files. Right click a file or directory
> in Explorer, select Properties, then advanced attributes. You can turn on
> compression there. Don't know how to do it from the command line, though.

 





Re: [sqlite] compression

2005-09-28 Thread Christian Smith
On Wed, 28 Sep 2005, Sid Liu wrote:

>Is there a possibility that this Reiser 4 be used on a
>file, rather than a file system? Hopefully on Windows?


Reiser FS is a filesystem. It manages files. So it cannot be used on a
file.

Windows NTFS already has compressed files. Right click a file or directory
in Explorer, select Properties, then advanced attributes. You can turn on
compression there. Don't know how to do it from the command line, though.


>
>--- Jay Sprenkle <[EMAIL PROTECTED]> wrote:
>
>> If you're on Linux read about the Reiser 4 file
>> system.
>> They found they could compress the entire file
>> system on the fly and achieve
>> higher performance as well. Most CPU's can compress
>> and move data faster
>> because they make up the difference on the slow I/O
>> channels to hard disks.
>> Might be a much easier solution
>>
>>
>> On 9/24/05, Martin Pfeifle <[EMAIL PROTECTED]>
>> wrote:
>> >
>> > Hello,
>> > does anybody know whether it is possible to
>> compress
>> > sqlite data on the page level. If I compress the
>> > sqlite database file with zlib I get very high
>> > compression rates due to the character of the
>> stored
>> > data.
>> > I think this problem is related to the problem of
>> > using encrypted databases. Perhaps it is possible
>> just
>> > to exchange the encryption function call by a zlib
>> > compression call.
>> > Integrating such a call simply into the read and
>> write
>> > functions in the file os_win.c does not work.
>> > Can anybody help me, or give me a hint?
>> > Ciao Martin
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>> --
>> ---
>> The Castles of Dereth Calendar: a tour of the art
>> and architecture of
>> Asheron's Call
>> http://www.lulu.com/content/77264
>>
>
>

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] compression

2005-09-28 Thread Sid Liu
Is there a possibility that this Reiser 4 be used on a
file, rather than a file system? Hopefully on Windows?

--- Jay Sprenkle <[EMAIL PROTECTED]> wrote:

> If you're on Linux read about the Reiser 4 file
> system.
> They found they could compress the entire file
> system on the fly and achieve
> higher performance as well. Most CPU's can compress
> and move data faster
> because they make up the difference on the slow I/O
> channels to hard disks.
> Might be a much easier solution
> 
> 
> On 9/24/05, Martin Pfeifle <[EMAIL PROTECTED]>
> wrote:
> >
> > Hello,
> > does anybody know whether it is possible to
> compress
> > sqlite data on the page level. If I compress the
> > sqlite database file with zlib I get very high
> > compression rates due to the character of the
> stored
> > data.
> > I think this problem is related to the problem of
> > using encrypted databases. Perhaps it is possible
> just
> > to exchange the encryption function call by a zlib
> > compression call.
> > Integrating such a call simply into the read and
> write
> > functions in the file os_win.c does not work.
> > Can anybody help me, or give me a hint?
> > Ciao Martin
> >
> >
> >
> >
>
> 
> 
> 
> --
> ---
> The Castles of Dereth Calendar: a tour of the art
> and architecture of
> Asheron's Call
> http://www.lulu.com/content/77264
> 



Re: [sqlite] compression

2005-09-27 Thread Jay Sprenkle
If you're on Linux, read about the Reiser 4 file system.
They found they could compress the entire file system on the fly and achieve
higher performance as well. Most CPUs can compress and move data faster than
they can move it uncompressed, because the time spent compressing is made up
on the slow I/O channels to hard disks.
That might be a much easier solution.
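
Before choosing between a compressing file system and page-level compression, it may help to estimate how well the pages of a given database compress. The following is only a rough Python sketch of such a measurement. It compresses each page separately with zlib and does not change how SQLite reads or writes the file. The table and the sample rows are made up.

```python
import os
import sqlite3
import tempfile
import zlib


def page_compression_totals(db_path):
    """Compress every database page on its own; return (raw_bytes, compressed_bytes)."""
    conn = sqlite3.connect(db_path)
    page_size = conn.execute("pragma page_size").fetchone()[0]
    conn.close()

    raw_total = 0
    compressed_total = 0
    with open(db_path, "rb") as f:
        while True:
            page = f.read(page_size)
            if not page:
                break
            raw_total += len(page)
            compressed_total += len(zlib.compress(page))
    return raw_total, compressed_total


with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "demo.db")

    # Throwaway database filled with repetitive text (hypothetical data).
    conn = sqlite3.connect(db_path)
    conn.execute("create table log(id integer primary key, msg text)")
    conn.executemany(
        "insert into log(msg) values (?)",
        [("sensor reading ok, status nominal, value=%d" % i,) for i in range(5000)],
    )
    conn.commit()
    conn.close()

    raw, compressed = page_compression_totals(db_path)

print("raw bytes:", raw, "compressed bytes:", compressed)
```

Compressing page by page gives a lower ratio than compressing the whole file at once, since each page starts with an empty history, so this is the more conservative estimate.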


On 9/24/05, Martin Pfeifle <[EMAIL PROTECTED]> wrote:
>
> Hello,
> does anybody know whether it is possible to compress
> sqlite data on the page level. If I compress the
> sqlite database file with zlib I get very high
> compression rates due to the character of the stored
> data.
> I think this problem is related to the problem of
> using encrypted databases. Perhaps it is possible just
> to exchange the encryption function call by a zlib
> compression call.
> Integrating such a call simply into the read and write
> functions in the file os_win.c does not work.
> Can anybody help me, or give me a hint?
> Ciao Martin
>
>
>
>



--
---
The Castles of Dereth Calendar: a tour of the art and architecture of
Asheron's Call
http://www.lulu.com/content/77264


Re: [sqlite] Compression

2005-01-03 Thread amead
[EMAIL PROTECTED] wrote:

> Hello all,
>
> First of all, allow me to wish everyone a Happy New Year and I hope it'll be a
> good one for all.
>
> My question is (and I've raised this topic back in September, but didn't get
> back to it since), does anyone have a free/commercial add-on for SQLite v3 to
> perform on-the-fly compression/decompression of data, preferably on a field
> level (compress just one of the fields, not the whole table)?
>
> Thank you,
>   Dennis

I had a file size problem so I considered this.  My googling didn't turn 
up any solutions and then, upon further thought, I decided that this 
probably wouldn't work for most applications.  I think the data would 
have to be compressed on a field basis and in that case, I think you 
would only get good compression on fairly long fields and then only if 
you didn't ever want to use those in a query.  (I have to assume that 
decompressing fields in order to find out whether they match a where 
condition would be deathly slow... I assume that it would cause a lot of 
problems with indexing as well...) 

So, under those conditions, it seems like you could move the 
compression/decompression out of SQLite and into your program: 
compress strings before you write them to the database, and decompress 
strings after you've retrieved them.
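
A minimal Python sketch of that approach, assuming zlib and a made-up docs table with one short, searchable column and one long, compressed column:

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
conn.execute("create table docs(id integer primary key, title text, body blob)")


def save_doc(title, body_text):
    """Compress the long field in the application, then store it as a BLOB."""
    blob = zlib.compress(body_text.encode("utf-8"))
    cur = conn.execute("insert into docs(title, body) values (?, ?)", (title, blob))
    return cur.lastrowid


def load_doc(doc_id):
    """Fetch the row, then decompress the body after retrieving it."""
    title, blob = conn.execute(
        "select title, body from docs where id = ?", (doc_id,)
    ).fetchone()
    return title, zlib.decompress(blob).decode("utf-8")


doc_id = save_doc("report", "lorem ipsum dolor sit amet " * 500)
title, body = load_doc(doc_id)
stored = conn.execute(
    "select length(body) from docs where id = ?", (doc_id,)
).fetchone()[0]
print(title, len(body), "characters stored in", stored, "bytes")
```

The title column stays uncompressed so it can still be used in a where clause or an index. The body column cannot be matched with LIKE or indexed in any useful way, which is the restriction described above.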

I may have a need like this.  However, the original problem didn't fit 
this structure (no long strings), so I never pursued this...

-Alan
--
Alan Mead - [EMAIL PROTECTED]
People often find it easier to be a result of the past than a cause of
the future.


Re: [sqlite] Compression

2005-01-01 Thread info
Well, actually that's exactly what I need - compression of large fields, not 
the whole database.

  Dennis
// MCP, MCSD
// ASP Developer Member
// Software for animal shelters!
// www.smartpethealth.com
// www.amazingfiles.com
- Original Message - 
From: <[EMAIL PROTECTED]>
To: 
Sent: Sunday, January 02, 2005 12:42 PM
Subject: Re: [sqlite] Compression


Compression in the DB is interesting. I think the commercial product mentioned
just does a field compress and that is all. In general this only works on
larger blob-like fields, as the overhead of the compressor is usually
somewhat high, and let's not forget the extra overhead of comp/decomp. The idea
I was playing with a while back (zlib) was a global db dictionary for
compression, but as memory got cheap and larger I dropped the project. The
simple token compression schemes (like the old days of FairCom's btree package,
Sybase IQ, Monet, etc.) are much nicer. It would be a neat feature.

Sandy

>> My question is (and I've raised this topic back in September, but
>> didn't get back to it since), does anyone have a free/commercial
>> add-on for SQLite v3 to perform on-the-fly compression/decompression
>> of data, preferably on a field level (compress just one of the fields,
>> not the whole table)?
>
> Related to this I would love to see reference counting of values.
> For example if I add the string "foobar" in 27 different places,
> it only gets stored once with a reference count of 27.
>
> There are various places that have done compression:
>
>   http://www.sqliteplus.com/
>
> There is also mention of compression at
>
>   http://www.hwaci.com/sw/sqlite/prosupport.html
>
> If you are working on a commercial product and SQLite has made your
> product better and/or improved your development process then it is
> fair and worthwhile to pay for that.
>
> Roger





Re: [sqlite] Compression

2005-01-01 Thread sganz
Compression in the DB is interesting. I think the commercial product mentioned
just does a field compress and that is all. In general this only works on
larger blob-like fields, as the overhead of the compressor is usually
somewhat high, and let's not forget the extra overhead of comp/decomp. The idea
I was playing with a while back (zlib) was a global db dictionary for
compression, but as memory got cheap and larger I dropped the project. The
simple token compression schemes (like the old days of FairCom's btree package,
Sybase IQ, Monet, etc.) are much nicer. It would be a neat feature.
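
One possible reading of the "global db dictionary" idea is zlib's preset dictionary feature, sketched below in Python. This is an assumption about what was meant, not a description of the abandoned project. The dictionary contents and the sample field are made up.

```python
import zlib

# Hypothetical shared dictionary: byte strings expected to recur in many fields.
SHARED_DICT = b'{"name": "", "species": "", "shelter": "", "vaccinated": false}'


def compress_with_dict(data):
    """Compress one field, priming zlib with the shared dictionary."""
    comp = zlib.compressobj(zdict=SHARED_DICT)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob):
    """Decompress a field; the same dictionary must be supplied again."""
    decomp = zlib.decompressobj(zdict=SHARED_DICT)
    return decomp.decompress(blob) + decomp.flush()


field = b'{"name": "Rex", "species": "dog", "shelter": "north", "vaccinated": false}'
plain = zlib.compress(field)        # no dictionary
primed = compress_with_dict(field)  # shared dictionary

print(len(field), len(plain), len(primed))
```

The dictionary lets short fields benefit from strings they share with other rows, at the cost of keeping that dictionary around, and unchanged, for every later read.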

Sandy

>> My question is (and I've raised this topic back in September, but
>> didn't get back to it since), does anyone have a free/commercial
>> add-on for SQLite v3 to perform on-the-fly compression/decompression
>> of data, preferably on a field level (compress just one of the fields,
>> not the whole table)?
>
> Related to this I would love to see reference counting of values.
> For example if I add the string "foobar" in 27 different places,
> it only gets stored once with a reference count of 27.
>
> There are various places that have done compression:
>
>   http://www.sqliteplus.com/
>
> There is also mention of compression at
>
>   http://www.hwaci.com/sw/sqlite/prosupport.html
>
> If you are working on a commercial product and SQLite has made your
> product better and/or improved your development process then it is
> fair and worthwhile to pay for that.
>
> Roger
>



Re: [sqlite] Compression

2005-01-01 Thread info
Thank you Roger,
I'm not against paying (that's why I said "commercial").
Have you (or anyone) used SQLite++? What are your thoughts on it?
Thank you,
  Dennis
// MCP, MCSD
// ASP Developer Member
// Software for animal shelters!
// www.smartpethealth.com
// www.amazingfiles.com
- Original Message - 
From: "Roger Binns" <[EMAIL PROTECTED]>
To: 
Sent: Sunday, January 02, 2005 11:18 AM
Subject: Re: [sqlite] Compression


> My question is (and I've raised this topic back in September, but 
> didn't get back to it since), does anyone have a free/commercial 
> add-on for SQLite v3 to perform on-the-fly compression/decompression 
> of data, preferably on a field level (compress just one of the fields, 
> not the whole table)?

Related to this I would love to see reference counting of values.
For example if I add the string "foobar" in 27 different places,
it only gets stored once with a reference count of 27.

There are various places that have done compression:

  http://www.sqliteplus.com/

There is also mention of compression at

  http://www.hwaci.com/sw/sqlite/prosupport.html

If you are working on a commercial product and SQLite has made your
product better and/or improved your development process then it is
fair and worthwhile to pay for that.

Roger



Re: [sqlite] Compression

2005-01-01 Thread Roger Binns
> My question is (and I've raised this topic back in September, but 
> didn't get back to it since), does anyone have a free/commercial 
> add-on for SQLite v3 to perform on-the-fly compression/decompression 
> of data, preferably on a field level (compress just one of the fields, 
> not the whole table)?

Related to this I would love to see reference counting of values.
For example if I add the string "foobar" in 27 different places,
it only gets stored once with a reference count of 27.
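
SQLite does not do this by itself, but the idea can be approximated at the schema level. A small Python sketch, with table and column names invented for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table strings(
        id integer primary key,
        value text unique not null,
        refcount integer not null default 0
    );
    create table notes(
        id integer primary key,
        string_id integer references strings(id)
    );
""")


def add_note(text):
    """Store a note, reusing an existing string row and bumping its refcount."""
    conn.execute("insert or ignore into strings(value) values (?)", (text,))
    conn.execute("update strings set refcount = refcount + 1 where value = ?", (text,))
    string_id = conn.execute(
        "select id from strings where value = ?", (text,)
    ).fetchone()[0]
    conn.execute("insert into notes(string_id) values (?)", (string_id,))


for _ in range(27):
    add_note("foobar")

copies, refcount = conn.execute(
    "select count(*), max(refcount) from strings"
).fetchone()
print(copies, refcount)  # prints "1 27": one stored copy, referenced 27 times
```

Deleting a note would need the matching decrement, and removal of the string row once the count reaches zero, which a trigger could handle.
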

There are various places that have done compression:

  http://www.sqliteplus.com/

There is also mention of compression at

  http://www.hwaci.com/sw/sqlite/prosupport.html

If you are working on a commercial product and SQLite has made your
product better and/or improved your development process then it is
fair and worthwhile to pay for that.

Roger