Hi Christian,

let's meet up this week to discuss the problem and hopefully fix it. So far I have stayed clear of the storeResources code, but with Vishesh not having much time I will dive into it.

Cheers,
Sebastian

On 10/31/2011 12:42 PM, Christian Mollekopf wrote:
> Hey,
>
> This issue starts to get pressing, a solution is needed for 4.8.
> Currently the feeders are broken because of that issue.
>
> The code in storeResources is beyond me and my attempts to fix it failed so
> far. So if no one fixes it there I'll have to work around the issue in the
> feeder code.
>
> I don't mean to push anyone, I'd just like to know if somebody from the
> nepomuk team (yes vishesh I'm looking at you ;-) is going to fix this, or if
> I'm on my own. As said, I do understand if you currently lack the time to
> make this happen, just tell me.
>
> Thanks,
> Christian
>
> PS: I added the pastes before they are deleted from pastie
>
> On Saturday, October 08, 2011 03:12:51 PM Christian Mollekopf wrote:
>> Hi Vishesh,
>>
>> The duplicates merging code doesn't cut it for the feeders yet.
>> As far as I could track it down the problem is that I have hierarchies of
>> resources which need to be merged together.
>> I.e. I add a contact with its email address several times to the graph. The
>> email addresses are now correctly merged, but because the contacts had
>> different email uris in the first hashing run (before they have been
>> merged), the contacts remain duplicated.
>>
>> Here is the test which currently fails:
>> http://paste.kde.org/131371/
>
> void DataManagementModelTest::testStoreResources_duplicates2()
> {
>     SimpleResource contact1;
>     contact1.addType( NCO::Contact() );
>     contact1.addProperty( NCO::fullname(), QLatin1String("Spiderman") );
>     contact1.addProperty( NAO::prefLabel(), QLatin1String("test") );
>
>     SimpleResource email1;
>     email1.addType(NCO::EmailAddress());
>     email1.addProperty(NCO::emailAddress(), QLatin1String("[email protected]"));
>     contact1.addProperty(NCO::hasEmailAddress(), email1.uri());
>
>     SimpleResource contact2;
>     contact2.addType( NCO::Contact() );
>     contact2.addProperty( NCO::fullname(), QLatin1String("Spiderman") );
>     contact2.addProperty( NAO::prefLabel(), QLatin1String("test") );
>
>     SimpleResource email2;
>     email2.addType(NCO::EmailAddress());
>     email2.addProperty(NCO::emailAddress(), QLatin1String("[email protected]"));
>     contact2.addProperty(NCO::hasEmailAddress(), email2.uri());
>
>     SimpleResourceGraph graph;
>     graph << email1 << contact1 << email2 << contact2;
>
>     m_dmModel->storeResources( graph, "appA" );
>     QVERIFY(!m_dmModel->lastError());
>
>     int contactCount = m_model->listStatements( Node(), RDF::type(),
>         NCO::Contact() ).allStatements().size();
>     QCOMPARE( contactCount, 1 );
>
>     int emailCount = m_model->listStatements( Node(), RDF::type(),
>         NCO::EmailAddress() ).allStatements().size();
>     QCOMPARE( emailCount, 1 );
>
>     QCOMPARE( m_model->listStatements( Node(), NCO::fullname(),
>         Node() ).allStatements().size(), 1 );
>     QCOMPARE( m_model->listStatements( Node(), NAO::prefLabel(),
>         Node() ).allStatements().size(), 1 );
>
>     QVERIFY(!haveTrailingGraphs());
> }
>
> add to qtest_dms.cpp:
>
>     model.addStatement( NCO::emailAddress(), RDF::type(), RDF::Property(), graph );
>     model.addStatement( NCO::emailAddress(), RDFS::range(), XMLSchema::string(), graph );
>     model.addStatement( NCO::emailAddress(), RDFS::domain(), NCO::EmailAddress(), graph );
>
>     model.addStatement( NCO::hasEmailAddress(), RDF::type(), RDF::Property(), graph );
>     model.addStatement( NCO::hasEmailAddress(), RDFS::range(), NCO::EmailAddress(), graph );
>     model.addStatement( NCO::hasEmailAddress(), RDFS::domain(), NCO::Contact(), graph );
>
>     model.addStatement( NCO::EmailAddress(), RDF::type(), RDFS::Resource(), graph );
>     model.addStatement( NCO::EmailAddress(), RDF::type(), RDFS::Class(), graph );
>     model.addStatement( NCO::EmailAddress(), RDFS::subClassOf(), NCO::ContactMedium(), graph );
>
>>
>> And here's an excerpt of the debugging output which shows the problem in the
>> actual feeders:
>> http://paste.kde.org/131377/
>>
>
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:zre""<http://www.semanticdesktop.org/ontologies/2007/08/15/nao#prefLabel>"""Sebastian Trueg""
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:zre""<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#PersonContact>"
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:zre""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname>"""Sebastian Trueg"^^<http://www.w3.org/2001/XMLSchema#string>"
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:zre""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#hasEmailAddress>""_:gqe"
>
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:gqe""<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#EmailAddress>"
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:gqe""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#emailAddress>"""[email protected]"^^<http://www.w3.org/2001/XMLSchema#string>"
>
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:fqe""<http://www.semanticdesktop.org/ontologies/2007/08/15/nao#prefLabel>"""Sebastian Trueg""
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:fqe""<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#PersonContact>"
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:fqe""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname>"""Sebastian Trueg"^^<http://www.w3.org/2001/XMLSchema#string>"
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: "_:fqe""<http://www.semanticdesktop.org/ontologies/2007/03/22/nco#hasEmailAddress>""_:gqe"
>
> This is the error returned after the storeResourceCall:
> nepomukstorage(21806)/nepomuk (storage service) Nepomuk::DataManagementModel::storeResources: Setting error! "Invalid argument (1)": "http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname has a max cardinality of 1. Provided 2 values - "Sebastian Trueg"^^<http://www.w3.org/2001/XMLSchema#string>, "Sebastian Trueg"^^<http://www.w3.org/2001/XMLSchema#string>. Existing - Affected Resource: nepomuk:/res/75164167-3ae0-413f-a991-ed73a08ca9ec, new card: 2, old card: 0"
> "/opt/devel/KDE/bin/nepomukservicestub(21806)" Soprano: "Invalid argument (1)": "http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname has a max cardinality of 1. Provided 2 values - "Sebastian Trueg"^^<http://www.w3.org/2001/XMLSchema#string>, "Sebastian Trueg"^^<http://www.w3.org/2001/XMLSchema#string>. Existing - Affected Resource: nepomuk:/res/75164167-3ae0-413f-a991-ed73a08ca9ec, new card: 2, old card: 0"
>
>> As I understand your code you generate a hash of each resource to check if
>> two are exactly the same. That probably works for most use-cases, but I'm
>> not sure if it is the best solution.
>> Given the problem above you'd have to rerun the hashing for the resources
>> which were modified due to a merged resource, so that already complicates
>> matters.
>>
>> I thought maybe it would be possible to leave the merging up to the normal
>> resource merger. This would have the effect that not only exactly equal
>> resources would be merged, but all, just as the resource merger would
>> normally merge them.
>> If you think of the SimpleResourceGraph as a tree, a post-order traversal of
>> the tree would allow you to store each resource one by one, starting from
>> the leaves of the branch going to the root. The ResourceMerger would then
>> automatically merge all resources as necessary.
>>
>> Do you think that would be a viable option?
>>
>> Cheers,
>> Christian
>>
>> _______________________________________________
>> Nepomuk mailing list
>> [email protected]
>> https://mail.kde.org/mailman/listinfo/nepomuk
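
To make the post-order idea from Christian's 08 October message a bit more concrete, here is a minimal standalone sketch. It does not use the Nepomuk or Soprano API at all: `Resource`, `Store`, `storeOrMerge`, `storePostOrder`, `storeGraph`, `makeEmail` and `makeContact` are hypothetical names invented for illustration, and the example address is a placeholder. The merge step here simply treats two resources as the same when their content is identical after links have been rewritten to stored ids; the real ResourceMerger may well decide differently. The sketch also assumes the graph has no cycles and that every link target is part of the graph.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a SimpleResource: a temporary id, literal
// properties, and links to other resources in the same graph.
struct Resource {
    std::string id;                                   // temporary id, e.g. "_:email1"
    std::multimap<std::string, std::string> literals; // property -> literal value
    std::multimap<std::string, std::string> links;    // property -> temporary id of target
};

// Hypothetical store. Assumption: two resources are merged when their
// content is identical once all links point at already stored ids.
struct Store {
    typedef std::set<std::pair<std::string, std::string> > Key;
    std::map<Key, std::string> byContent; // content -> stored id
    int nextId;

    Store() : nextId(0) {}

    std::string storeOrMerge(const Key& key) {
        std::map<Key, std::string>::const_iterator it = byContent.find(key);
        if (it != byContent.end())
            return it->second;                        // identical resource exists: merge
        std::string id = "res:" + std::to_string(nextId++);
        byContent.insert(std::make_pair(key, id));
        return id;
    }
};

// Post-order: store everything a resource links to first, then the resource
// itself, so its links already carry the merged ids when it is compared.
std::string storePostOrder(const std::string& id,
                           const std::map<std::string, Resource>& graph,
                           std::map<std::string, std::string>& resolved,
                           Store& store)
{
    std::map<std::string, std::string>::const_iterator done = resolved.find(id);
    if (done != resolved.end())
        return done->second;                          // already stored in this run

    const Resource& res = graph.at(id);               // throws if the target is missing
    Store::Key key;
    for (const auto& kv : res.literals)
        key.insert(std::make_pair(kv.first, "lit:" + kv.second));
    for (const auto& kv : res.links)
        key.insert(std::make_pair(
            kv.first, "ref:" + storePostOrder(kv.second, graph, resolved, store)));

    std::string stored = store.storeOrMerge(key);
    resolved[id] = stored;
    return stored;
}

// Stores the whole graph and returns the number of distinct stored resources.
std::size_t storeGraph(const std::vector<Resource>& resources, Store& store)
{
    std::map<std::string, Resource> graph;
    for (const auto& r : resources)
        graph[r.id] = r;
    std::map<std::string, std::string> resolved;
    for (const auto& r : resources)
        storePostOrder(r.id, graph, resolved, store);
    return store.byContent.size();
}

// Small helpers to build the example resources.
Resource makeEmail(const std::string& id, const std::string& address)
{
    Resource r;
    r.id = id;
    r.literals.insert(std::make_pair(std::string("emailAddress"), address));
    return r;
}

Resource makeContact(const std::string& id, const std::string& fullname,
                     const std::string& emailId)
{
    Resource r;
    r.id = id;
    r.literals.insert(std::make_pair(std::string("fullname"), fullname));
    r.links.insert(std::make_pair(std::string("hasEmailAddress"), emailId));
    return r;
}
```

Fed with the same shape as the failing test (two identical contacts, each pointing at its own but identical email resource), `storeGraph` ends up with two stored resources: by the time the second contact is compared, its email link already carries the merged id, so no second hashing pass is needed. Whether the real ResourceMerger can be driven one resource at a time like this is exactly the open question in the thread.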
