Hi Joe,

If you run a Debian-based distro: apt-get build-dep tracker. Otherwise, in this case you need to install a package which is usually called gobject-introspection.
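For what it's worth, that syntax error usually means the GOBJECT_INTROSPECTION_CHECK macro was never expanded when autogen.sh generated configure, typically because the introspection m4 file was not installed at that point. Below is a rough sketch for checking this, assuming you run it from the top of the tracker checkout. The function name is made up for illustration and is not part of Tracker or autotools.

```shell
# Hypothetical helper: report whether a generated configure script still
# contains the raw, unexpanded GOBJECT_INTROSPECTION_CHECK macro call.
has_unexpanded_gi_macro() {
    # When the macro is expanded, no line starts with the literal macro call.
    grep -q '^[[:space:]]*GOBJECT_INTROSPECTION_CHECK(' "$1"
}

# Only inspect ./configure if it exists in the current directory.
if [ -f ./configure ] && has_unexpanded_gi_macro ./configure; then
    echo "macro not expanded: install the introspection package, then re-run ./autogen.sh"
fi
```

If the check fires, install the package and then run autogen.sh again. Re-running only ./configure will not help, because the configure script itself has to be regenerated.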
PS: for the office files you'll need GSF, libgsf-1-dev or something.

Kind regards,

Philip

On Mon, 2016-01-11 at 21:33 -0500, Joe Rhodes wrote:
> Carlos, et al.,
>
> I'm sorry, but I cannot seem to build the master branch right now. I ran the
> autogen.sh script and then configure dies on me with this:
>
> checking for pkg-config... /usr/bin/pkg-config
> checking pkg-config is at least version 0.16... yes
> ./configure: line 19136: syntax error near unexpected token `0.9.5'
> ./configure: line 19136: `GOBJECT_INTROSPECTION_CHECK(0.9.5)'
>
> I'm not entirely sure what's going on there. (Sorry, programming is not my
> forte.) I'll have to wait for 1.7.2 and give that a try. I can only work
> on this in the evenings when I'm not at work and the server that's housing all
> of this data is otherwise not terribly busy.
>
> Cheers!
> -Joe Rhodes
>
> On Jan 11, 2016, at 5:21 AM, Philip Van Hoof <phi...@codeminded.be> wrote:
> >
> > Hi Carlos,
> >
> > Looks like my git-account has been closed on GNOME, so here is a patch
> > for one of the issues in that valgrind.
> >
> > Kind regards,
> >
> > Philip
> >
> > On Sun, 2016-01-10 at 16:05 -0500, Joe Rhodes wrote:
> >> Carlos:
> >>
> >> Yes, there are a LOT of files on this volume. The makeup of the 5 TB of
> >> data is PDFs, Photoshop files, Word docs, InDesign & Illustrator docs.
> >> There are very few large files like MP3's or videos. If I disable all
> >> the extractors and just build an index based on file names, I get an index
> >> of about 3 GB.
> >>
> >> I did notice that I was possibly indexing all of my snapshots of my
> >> volumes. I'm using ZFS and they're available under "/volume/.zfs". I've
> >> added that folder to my list of excluded directories:
> >>
> >> org.freedesktop.Tracker.Miner.Files ignored-directories ['.zfs', 'ZZZ Snapshots', 'po', 'CVS', 'core-dumps', 'lost+found']
> >>
> >> I'll see if that makes any difference.
> >> If it was digging into those, that
> >> would greatly increase the number of files.
> >>
> >> I'm not entirely sure how to start tracker with the valgrind command.
> >> Tracker is currently started automatically by the Netatalk file server
> >> process. In order to run the tracker processes, I have to execute the
> >> following:
> >>
> >> PREFIX="/main-storage"
> >> export XDG_DATA_HOME="$PREFIX/var/netatalk/"
> >> export XDG_CACHE_HOME="$PREFIX/var/netatalk/"
> >> export DBUS_SESSION_BUS_ADDRESS="unix:path=$PREFIX/var/netatalk/spotlight.ipc"
> >> /usr/local/bin/tracker daemon -t
> >>
> >> So after stopping the daemon, I just tried the following:
> >>
> >> valgrind --leak-check=full --log-file=valgrind-tracker-extract-log --num-callers=30 /usr/local/libexec/tracker-extract
> >> valgrind --leak-check=full --log-file=valgrind-tracker-miner-fs-log --num-callers=30 /usr/local/libexec/tracker-miner-fs
> >>
> >> Hopefully that will get you what you want?
> >>
> >> I've uploaded the log files to Dropbox. Hopefully you can easily
> >> grab those without having to jump through too many hoops.
> >>
> >> https://www.dropbox.com/s/o3w10hnaa6ikvn3/valgrind-tracker-extract-log.gz?dl=0
> >> https://www.dropbox.com/s/5s4vqk0owrf5gjd/valgrind-tracker-miner-fs-log.gz?dl=0
> >>
> >> I let them run for a bit. I could definitely see RAM usage start to
> >> climb. I didn't bother to let it go to GB's in size. I think I was at
> >> about 300MB when I hit Ctrl-C.
> >>
> >> Cheers!
> >> -Joe Rhodes
> >>
> >>> On Jan 10, 2016, at 2:25 PM, Carlos Garnacho <carl...@gnome.org> wrote:
> >>>
> >>> Hi Joe,
> >>>
> >>> On Sun, Jan 10, 2016 at 6:40 PM, Joe Rhodes <li...@joerhodes.com> wrote:
> >>>> I have just compiled and installed tracker-1.7.1 on a CentOS 7.1 box. I
> >>>> just used the default configuration ("./configure" with no additional
> >>>> options). I'm indexing around 5 TB of data.
> >>>> I'm noticing that both the
> >>>> tracker-extract and tracker-miner-fs processes are using a large amount
> >>>> of RAM. The tracker-extract process is currently using 11 GB of RAM (RES
> >>>> not VIRT as reported by top), while the tracker-miner-fs is sitting at
> >>>> 4.5 GB.
> >>>>
> >>>> Both processes start out modestly, but continue to grow as they do their
> >>>> work. The tracker-miner-fs levels off at 4.5 GB once it appears to have
> >>>> finished crawling the entire volume. (Once the CPU usage goes back down to
> >>>> near 0.) The tracker-extract process also continues to grow as it works.
> >>>> Once it is done, it levels off. Last time it stayed at about 9 GB.
> >>>>
> >>>> If I restart tracker (with: 'tracker daemon -t' followed by
> >>>> 'tracker daemon -s') a similar thing will happen with tracker-miner-fs.
> >>>> It will grow back to 4.5 GB as it crawls its way across the entire volume.
> >>>> The tracker-extract process though, because all of the files were just
> >>>> indexed and it doesn't need to do much, uses a very modest amount of RAM.
> >>>> I don't have that number right now because I'm re-indexing the entire
> >>>> volume, but it's well below 100 MB.
> >>>>
> >>>> Is this expected behaviour? Or is there a memory leak? Or perhaps tracker
> >>>> just isn't designed to operate on this large of a volume?
> >>>
> >>> It totally sounds like a memory leak, although it sounds strange that
> >>> it hits both tracker-miner-fs and tracker-extract.
> >>>
> >>> There is obviously an impact to running Tracker on large directory
> >>> trees, such as:
> >>>
> >>> - Possibly exhausted inotify handles, the directories we fail to
> >>>   create a monitor for would just be checked/updated on next miner
> >>>   startup
> >>> - More (longer, rather) IO/CPU usage during startup, because the miner
> >>>   has to check mtimes for all directories and files
> >>> - The miner also needs to keep an in-memory representation of the
> >>>   directory tree for accounting purposes (file monitors, etc). Regular
> >>>   files are represented in this model only as long as they're being
> >>>   checked/processed, and disappear soon after. This might account for a
> >>>   memory peak at startup, if there's many items left to process, because
> >>>   Tracker dumps files into processing queues ASAP, but I think the
> >>>   memory usage should be nowhere near as big.
> >>>
> >>> So I think nothing accounts for such memory usage in tracker-miner-fs,
> >>> the only known source of unbounded memory growth is the number of
> >>> directories (and regular files for the peak at startup) to be indexed,
> >>> but you would need millions of those to have tracker-miner-fs grow up
> >>> to 4.5GB.
> >>>
> >>> And tracker-extract has a much shorter memory, it just checks the
> >>> files that need extraction in small batches, and processes those one
> >>> by one before querying the next batch. 9GB shouts memory leak, we've
> >>> had other memory leak situations in tracker-extract, and the culprit
> >>> most often is in the various libraries we're using in our extract
> >>> modules, if many files end up triggering that module (and the leaky
> >>> code path in the specific library), the effect will accumulate over
> >>> time.
> >>>
> >>> The downside of this situation is that most often we Tracker
> >>> developers can't reproduce unless we have a file that triggers the
> >>> leak so we can fix it or channel to the appropriate maintainers, so it
> >>> would be great if you could provide valgrind logs, just run as:
> >>>
> >>> valgrind --leak-check=full --log-file=valgrind-log --num-callers=30 /path/to/built/tracker-extract
> >>>
> >>> Hit ctrl-C when enough time has passed, and send back the valgrind-log
> >>> file. Same applies to tracker-miner-fs.
> >>>
> >>>> My tracker meta.db file is about 13 GB right now, though still growing. I
> >>>> suspect it's close to indexed though.
> >>>
> >>> This is also suspicious, you again need either a hideous amount of
> >>> files to have meta.db grow as large, or an equally hideous amount of
> >>> plain text content that gets indexed. Out of curiosity, how many
> >>> directories/files does that partition contain? is the content
> >>> primarily video/documents/etc?
> >>>
> >>> Cheers,
> >>> Carlos
> >>
> >> _______________________________________________
> >> tracker-list mailing list
> >> tracker-list@gnome.org
> >> https://mail.gnome.org/mailman/listinfo/tracker-list
> >
> > <0001-Fix-small-memory-leak.patch>
_______________________________________________
tracker-list mailing list
tracker-list@gnome.org
https://mail.gnome.org/mailman/listinfo/tracker-list