[Wikidata-bugs] [Maniphest] [Commented On] T215413: Image Classification Working Group
Ottomata added a comment.

Not sure if this is relevant, but this seemed the best place to note it. I just came across: https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_YARN
It seems relatively easy to package up (e.g. on a notebook host), ship to HDFS, and then include in a Spark job.

TASK DETAIL
https://phabricator.wikimedia.org/T215413

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Miriam, Ottomata
Cc: Ottomata, Jheald, Cirdan, MoritzMuehlenhoff, CDanis, akosiaris, SandraF_WMF, Fuzheado, PDrouin-WMF, Krenair, d.astrikov, JoeWalsh, Nirzar, dcausse, fgiunchedi, JAllemandou, leila, Capt_Swing, mpopov, Nuria, DarTar, Halfak, Gilles, EBernhardson, dr0ptp4kt, Harej, MusikAnimal, Abit, elukey, diego, Cparle, Ramsey-WMF, Miriam, Isaac, darthmon_wmde, Premeditated, Nandana, JKSTNK, Akovalyov, Lahi, Gq86, E1presidente, Anooprao, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, V4switch, LawExplorer, Salgo60, Avner, Silverfish, _jensen, rosalieper, Susannaanas, Wong128hk, Jane023, terrrydactyl, Wikidata-bugs, Base, matthiasmullie, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Matanya, Mbch331, jeremyb
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T215413: Image Classification Working Group
Fuzheado added a comment.

FYI, some developments in the area of using image classification in the Wikiverse: we now have a Wikidata Distributed Game - Depicts that uses image classification ML to generate candidates. It came out of a project I did with The Met Museum and Microsoft.
https://outreach.wikimedia.org/wiki/GLAM/Newsletter/January_2019/Contents/USA_report
[Wikidata-bugs] [Maniphest] [Commented On] T215413: Image Classification Working Group
Miriam added a comment.

@Gilles thanks for this! Photographs and graphics have very different underlying image statistics, so it is fairly easy for a classifier to tell them apart. It should be feasible. If we can collect some training data by finding one or more categories in Commons with a substantial number of diverse graphics images, I can try to quickly build a graphics vs. photo classifier by fine-tuning an existing image classifier (it won't be perfect, but no GPU needed ;) )

@Isaac maybe your colleague can help with this by sharing which categories and keywords he used to create his training data?

As a side note, such a classifier can also help improve the accuracy of other image classifiers (e.g. object detectors or image quality classifiers), which are typically trained on photographic material and therefore fail completely when classifying non-photographic images. We did studies in the past to quantitatively explain the importance and the nature of the difference between graphics and photographs: https://www.dropbox.com/s/y97h8kjx84hbrzk/p242-redi.pdf?dl=0
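
[Editor's illustration] The point about image statistics can be shown with a toy sketch. This is not the fine-tuning approach proposed in the comment and not the method of the linked paper. It assumes one very crude statistic (the share of distinct colours in an image) and runs on synthetic pixel lists. The function names, the threshold and the sample data are all hypothetical.

```python
import random


def distinct_color_ratio(pixels):
    """Return the number of distinct RGB values divided by the pixel count."""
    if not pixels:
        raise ValueError("empty image")
    return len(set(pixels)) / len(pixels)


def looks_like_graphic(pixels, threshold=0.1):
    """Guess 'graphic' when few distinct colours are used.

    The threshold is a hypothetical cut-off chosen for this toy data,
    not a value taken from the thread or from any paper.
    """
    return distinct_color_ratio(pixels) < threshold


# Synthetic stand-in for a flat-colour diagram: three colours, 10,000 pixels.
flat_graphic = (
    [(255, 255, 255)] * 6000 + [(0, 0, 0)] * 3000 + [(200, 0, 0)] * 1000
)

# Synthetic stand-in for a photograph: seeded random colours, so almost
# every pixel has its own RGB value.
rng = random.Random(0)
noisy_photo = [
    (rng.randrange(256), rng.randrange(256), rng.randrange(256))
    for _ in range(10000)
]

print("graphic ratio:", distinct_color_ratio(flat_graphic))
print("photo ratio:  ", distinct_color_ratio(noisy_photo))
print("graphic flagged:", looks_like_graphic(flat_graphic))
print("photo flagged:  ", looks_like_graphic(noisy_photo))
```

Real Commons files are much messier (JPEG-compressed diagrams, posterised photos), so a single hand-picked statistic like this would not replace a classifier fine-tuned on labelled examples.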
[Wikidata-bugs] [Maniphest] [Commented On] T215413: Image Classification Working Group
Isaac added a comment.

If we go down that pathway of trying to identify which images are photographs, we should look into work by a former colleague of mine on detecting visualizations on Commons (in some ways, the inverse task): http://brenthecht.com/publications/www18_vizbywiki.pdf
He (Allen Lin) might have some insight into easy wins or pitfalls in building a model like that.