> From: Sebastian Trüg <[email protected]>
> Hi guys,
> ...
> ... It is just that we try to keep data that can be extracted from files
> separate from the rest, i.e. have it in one graph which can easily be removed
> and recreated. That is what the Strigi indexer service does.
> ...
> Resource::setProperty overwrites any existing triple with that
> subject/property pair.
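
To check my understanding of the two behaviours described above, here is a rough toy model in Python. It is only a sketch and not the Nepomuk API: the ToyStore class, the graph names "index-graph" and "app-graph", the file URL and the property labels are all made up for illustration. I am also assuming that an overwrite drops the old statement whichever graph it was in and stores the new one in the caller's graph.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy quad store: each statement is (subject, property, value, graph)."""
    quads: set = field(default_factory=set)

    def add(self, subject, prop, value, graph):
        self.quads.add((subject, prop, value, graph))

    def set_property(self, subject, prop, value, graph):
        # Assumed semantics: drop every statement with this subject/property
        # pair, whatever graph it lives in, then store the new value.
        self.quads = {q for q in self.quads if (q[0], q[1]) != (subject, prop)}
        self.add(subject, prop, value, graph)

    def remove_graph(self, graph):
        # Models throwing away a whole graph, e.g. before re-indexing a file.
        self.quads = {q for q in self.quads if q[3] != graph}

    def values(self, subject, prop):
        return sorted(q[2] for q in self.quads if (q[0], q[1]) == (subject, prop))


store = ToyStore()
song = "file:///music/song.mp3"  # hypothetical file

# The indexer writes extractable data into its own graph.
store.add(song, "nie:title", "Untitled", "index-graph")
store.add(song, "nfo:duration", 215, "index-graph")

# An application then overwrites the title and adds a rating of its own.
store.set_property(song, "nie:title", "Redemption Song", "app-graph")
store.set_property(song, "nao:numericRating", 9, "app-graph")

# Dropping the index graph removes only what the indexer still owns.
store.remove_graph("index-graph")
print(store.values(song, "nie:title"))
print(store.values(song, "nfo:duration"))
print(store.values(song, "nao:numericRating"))
```

In this model the application's title and rating survive the removal of the index graph, while the duration is gone until the file is indexed again.
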

Ahh. I think I see now... So if Bangarang creates the data in Nepomuk before
Strigi does, Strigi may subsequently create a separate index graph containing
the same extractable data. If Strigi creates it first, then Bangarang updates
the existing data.

> The problem here is that Bangarang uses Taglib which cannot be used in
> Strigi.
> Strigi is stream-based while Taglib is not. This is a well known and old
> problem which leads to so much rewriting of code for Strigi.
> So converting the Banganrang analyzer to lsa is not an option, at least not
> until the latter becomes a non-stream-based API.
>

Ouch.

> There is no duplication of data at the moment. The only thing that needs
> fixing if I saw correctly is that Banganrang does use plain strings for
> artists instead of nco:Contact resources.

In KDE SC 4.3 Bangarang uses the xesam ontology, which defines just a string
for the xesam:artist property. But for 4.4 it should be creating an
nco:creator property pointing to an nco:Contact resource and then setting
nco:fullName on the nco:Contact. If it's not doing that, then I need to fix
it. :-)

> This is another problem (not of Banganrang but lsa): it would be good to
> reuse
> these contacts instead of recreating them everytime. The latter results in a
> lot of Contact resources which confuses KDEPIM.

Hmm. I probably don't have enough background on the topic, but I wonder
whether PIMO might be a better fit for media artist/performer/composer/etc.
metadata. I've always imagined nco:Contact in the desktop context as a
person/company that I actually have some kind of contact with, rather than a
placeholder for any person/group. I love Muse, System of a Down and Bob
Marley, but, much as I'd like to, I don't think I know them well enough for
them to be in my address book. Just a (perhaps misguided) thought...

> The Strigi service does create the one graph which is
> marked as the index graph for that particular file[1]. This graph contains
> only data that can be recreated by re-indexing the file. So in theory
> Bangarang would need to add its own graph with data only extracted from the
> file. But then we need to sync that data. If we were to put it in the same
> graph that is used by Strigi then Strigi would delete that data again on
> update. Also not a perfect solution. But maybe better since media files
> almost
> never change....

Perhaps another workaround could be for apps working with file resources in
Nepomuk to always give Strigi a chance to create the Nepomuk data first.
Bangarang could ask Strigi to index any file it opens. After that it can
continue to work as it currently does: updating existing triples and adding
new triples for custom metadata (Strigi shouldn't delete these on
re-indexing, right?).

Longer term we could work on the plugin issue for extending Strigi's indexing
capabilities.

Hope this helps,
Andrew

_______________________________________________
Nepomuk mailing list
[email protected]
https://mail.kde.org/mailman/listinfo/nepomuk
