* There are some errors in the links to files and special pages.
Examples (each link, then the page it leads to):
קובץ:Nuvola_apps_important.svg <http://commons.wikimedia.org/wiki/File:Nuvola_apps_important.svg>
links to ויקיפדיה:מיזמי ויקיפדיה/מיזם ערכים ללא תמונות/קטגוריות/ספורטאים איטלקים
(Wikipedia:Wikipedia projects/Articles without images project/Categories/Italian sportspeople)
מיוחד:אקראי (Special:Random) > 15 במאי (May 15)
מיוחד:שינויים אחרונים (Special:RecentChanges) > 10_באוגוסט (August 10)

* Size is important to us because we intend to add images.
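
For anyone who wants to compare the index size with the ZIM size on their own machine, here is a minimal sketch in Python. It is not part of Kiwix or openZIM; the function names are made up for illustration, and the paths you pass in are whatever your own profile uses. With the figures from this thread (about 2.3 GB of index for a ~300 MB ZIM file) the ratio comes out at roughly 8 to 1.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, recursively."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that linked files are not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def index_to_zim_ratio(index_dir, zim_file):
    """Return how many times larger the index directory is than the ZIM file.

    Both arguments are hypothetical placeholders: index_dir would be the
    *.index directory inside the Kiwix profile, zim_file the ZIM archive.
    """
    return dir_size_bytes(index_dir) / os.path.getsize(zim_file)
```

Calling index_to_zim_ratio() with the .index directory and the ZIM file gives a single number that is easy to track as images are added.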

2009/7/6 <[email protected]>

> Today's Topics:
>
>   1. Kiwix index size (Asaf Bartov)
>   2. Re: Kiwix index size (Manuel Schneider)
>   3. Re: Kiwix index size (Emmanuel Engelhart)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 5 Jul 2009 19:18:57 +0300
> From: Asaf Bartov <[email protected]>
> Subject: [openZIM dev-l] Kiwix index size
> To: [email protected]
> Message-ID:
>        <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi, everyone.
>
> When running Kiwix's indexer on the ZIM file I had created from the Hebrew
> Wikipedia last week, the Kiwix data directory ran up to a total of 31 items,
> totalling 2.3 GB.  The ZIM file itself is ~300MB.  Does this proportion make
> sense?
>
> Detailed ls output attached.
>
> Thanks in advance,
>
>   Asaf Bartov
> --
> Asaf Bartov <[email protected]>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <http://intern.openzim.org/pipermail/dev-l/attachments/20090705/2afee878/attachment.html>
> -------------- next part --------------
> ro...@desktop:~/.www.kiwix.org/kiwix$ ls -l -h -a -R
> .:
> total 16K
> drwx------ 3 rotem rotem 4.0K 2009-07-01 16:10 .
> drwx------ 3 rotem rotem 4.0K 2009-07-01 16:10 ..
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 7680jxd5.default
> -rw-r--r-- 1 rotem rotem   94 2009-07-01 16:10 profiles.ini
>
> ./7680jxd5.default:
> total 1.7M
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 .
> drwx------ 3 rotem rotem 4.0K 2009-07-01 16:10 ..
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-02 05:13 31c26198d06ad265677b450796cc09aa.index
> -rw------- 1 rotem rotem  162 2009-07-05 18:19 compatibility.ini
> -rw-r--r-- 1 rotem rotem 135K 2009-07-05 18:19 compreg.dat
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-01 16:10 extensions
> -rw-r--r-- 1 rotem rotem  169 2009-07-01 16:10 localstore.rdf
> -rw-r--r-- 1 rotem rotem  304 2009-07-05 18:39 mimeTypes.rdf
> -rw-r--r-- 1 rotem rotem    0 2009-07-05 18:40 .parentlock
> -rw-r--r-- 1 rotem rotem 2.0K 2009-07-01 16:10 permissions.sqlite
> -rw-r--r-- 1 rotem rotem 128K 2009-07-05 18:54 places.sqlite
> -rw------- 1 rotem rotem  951 2009-07-05 19:00 prefs.js
> -rw-r--r-- 1 rotem rotem 1.1M 2009-07-05 18:20 XPC.mfasl
> -rw-r--r-- 1 rotem rotem  98K 2009-07-05 18:19 xpti.dat
> -rw-r--r-- 1 rotem rotem  98K 2009-07-05 18:20 XUL.mfasl
>
> ./7680jxd5.default/31c26198d06ad265677b450796cc09aa.index:
> total 2.4G
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-02 05:13 .
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 ..
> -rw-r--r-- 1 rotem rotem    0 2009-07-02 01:46 flintlock
> -rw-r--r-- 1 rotem rotem   12 2009-07-02 01:46 iamflint
> -rw-r--r-- 1 rotem rotem  22K 2009-07-02 05:13 position.baseA
> -rw-r--r-- 1 rotem rotem  21K 2009-07-02 05:10 position.baseB
> -rw-r--r-- 1 rotem rotem 1.4G 2009-07-02 05:13 position.DB
> -rw-r--r-- 1 rotem rotem  12K 2009-07-02 05:13 postlist.baseA
> -rw-r--r-- 1 rotem rotem  12K 2009-07-02 05:10 postlist.baseB
> -rw-r--r-- 1 rotem rotem 754M 2009-07-02 05:13 postlist.DB
> -rw-r--r-- 1 rotem rotem   70 2009-07-02 05:13 record.baseA
> -rw-r--r-- 1 rotem rotem   70 2009-07-02 05:10 record.baseB
> -rw-r--r-- 1 rotem rotem 3.3M 2009-07-02 05:13 record.DB
> -rw-r--r-- 1 rotem rotem 4.4K 2009-07-02 05:13 termlist.baseA
> -rw-r--r-- 1 rotem rotem 4.3K 2009-07-02 05:10 termlist.baseB
> -rw-r--r-- 1 rotem rotem 278M 2009-07-02 05:13 termlist.DB
> -rw-r--r-- 1 rotem rotem  232 2009-07-02 05:13 value.baseA
> -rw-r--r-- 1 rotem rotem  230 2009-07-02 05:10 value.baseB
> -rw-r--r-- 1 rotem rotem  14M 2009-07-02 05:13 value.DB
>
> ./7680jxd5.default/extensions:
> total 8.0K
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-01 16:10 .
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 ..
> ro...@desktop:~/.www.kiwix.org/kiwix$
>
> ------------------------------
>
> Message: 2
> Date: Sun, 5 Jul 2009 20:57:39 +0200
> From: Manuel Schneider <[email protected]>
> Subject: Re: [openZIM dev-l] Kiwix index size
> To: [email protected], [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain;  charset="utf-8"
>
> Hi Asaf,
>
> On Sunday, 5 July 2009, Asaf Bartov wrote:
> > When running Kiwix's indexer on the ZIM file I had created from the Hebrew
> > Wikipedia last week, the Kiwix data directory ran up to a total of 31
> > items, totalling 2.3 GB.  The ZIM file itself is ~300MB.  Does this
> > proportion make sense?
>
> I am not sure about the other files that were created; you only need the
> ZIM file with the index itself.
>
> For 900'000 articles the ZIM file containing the articles was 1.4 GB, the
> Index ZIM was 1.0 GB.
>
> So I think 300 MB looks fine.
>
> Greets,
>
>
> Manuel
> --
> Regards
> Manuel Schneider
>
> Wikimedia CH - Verein zur Förderung Freien Wissens
> Wikimedia CH - Association for the advancement of free knowledge
> www.wikimedia.ch
>
>
> ------------------------------
>
> Message: 3
> Date: Sun, 05 Jul 2009 21:05:33 +0200
> From: Emmanuel Engelhart <[email protected]>
> Subject: Re: [openZIM dev-l] Kiwix index size
> To: [email protected], [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Asaf
> Asaf Bartov wrote:
> > When running Kiwix's indexer on the ZIM file I had created from the Hebrew
> > Wikipedia last week, the Kiwix data directory ran up to a total of 31 items,
> > totalling 2.3 GB.  The ZIM file itself is ~300MB.  Does this proportion make
> > sense?
>
> This is possible. Kiwix uses the Xapian search engine, which generates
> pretty big index files.
>
> I have two questions:
> * Are the search results OK?
> * Do you have a problem with the size of the index? Do you have a size
> limit?
>
> There are many open search/index software packages. I chose Xapian for
> many reasons, but under certain conditions it would be possible to add
> support for another search engine to Kiwix. It should also be possible
> to make a modified version of the indexer that uses less disk space
> (but indexes fewer words).
>
> OpenZIM itself provides a search solution; Tommi can tell you more
> about it. Maybe it would be interesting for you to test it and give us
> some feedback!
>
> Regards
> Emmanuel
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkpQ+XcACgkQn3IpJRpNWtPm8wCfcmzwRfg6/9ttuknkURF7ct5I
> JLAAoLbVJWqXUKIeh8Mpua3GD+bjI5ZD
> =RH/U
> -----END PGP SIGNATURE-----
>
>
> ------------------------------
>
> _______________________________________________
> dev-l mailing list
> [email protected]
> https://intern.openzim.org/mailman/listinfo/dev-l
>
>
> End of dev-l Digest, Vol 5, Issue 2
> ***********************************
>


-- 
Rotem Simha
