Hello Robert,

I can confirm that the live service runs the same version as the latest
binary distribution that you can download from [1] (0.7.0-incubating).

To programmatically extract the same data you get from the live service,
make sure that the Any23 extraction [2] is run with the same parameters [3].
The service implementation is defined in [4]. What usually causes
differences in the data layout is the Metadata Nesting flag [5], which
connects the graph forest [6] extracted from a page into a single connected
graph that reflects the nesting relationships the trees of the forest have
in the original HTML DOM.
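
To make the forest idea concrete, here is a toy sketch in plain Java. It is
not Any23 code and does not use the Any23 API: the class ForestNesting, its
methods, and the node labels are all hypothetical. It only models the
principle, assuming each extracted tree is a set of undirected edges and that
nesting adds one edge from a containing node to the root of the tree nested
inside it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical, not Any23 code) of joining a forest into one graph. */
final class ForestNesting {

    /** Counts connected components; every edge endpoint must be listed in nodes. */
    static int countComponents(Set<String> nodes, List<String[]> edges) {
        Map<String, String> parent = new HashMap<>();
        for (String n : nodes) {
            parent.put(n, n); // each node starts as its own component
        }
        for (String[] e : edges) {
            // union: attach the representative of one endpoint to the other
            parent.put(find(parent, e[0]), find(parent, e[1]));
        }
        Set<String> representatives = new HashSet<>();
        for (String n : nodes) {
            representatives.add(find(parent, n));
        }
        return representatives.size();
    }

    /** Follows parent links until the component representative is reached. */
    static String find(Map<String, String> parent, String n) {
        while (!parent.get(n).equals(n)) {
            n = parent.get(n);
        }
        return n;
    }

    /**
     * Returns the original edges plus one nesting edge per entry of nestedIn,
     * where the key is the root of a nested tree and the value its container.
     */
    static List<String[]> withNesting(List<String[]> edges, Map<String, String> nestedIn) {
        List<String[]> out = new ArrayList<>(edges);
        for (Map.Entry<String, String> e : nestedIn.entrySet()) {
            out.add(new String[] {e.getValue(), e.getKey()});
        }
        return out;
    }
}
```

With two separate trees, say recipe-ingredient and hcard-adr, countComponents
reports two components; after withNesting adds an edge for "hcard is nested in
recipe" it reports one. That is the kind of layout difference the flag can
introduce between two runs over the same page.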

Hope this helps.

Best,
Mic

[1] http://any23.apache.org/download.html
[2] org.apache.any23.Any23#extract
[3] org.apache.any23.extractor.ExtractionParameters
[4] org.apache.any23.servlet.WebResponder
[5] org.apache.any23.extractor.ExtractionParameters#METADATA_NESTING_FLAG
[6] http://en.wikipedia.org/wiki/Tree_(graph_theory)

On 2 October 2012 21:50, Robert Meusel <rob...@informatik.uni-mannheim.de> wrote:

> Below is an example of the difference. I used the example document from the
> first mail:
>
> Any23.org
> ---------
>
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Recipe> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/fn> "Receta de Tarta de naranja y
> chocolate" .
> _:node16687929411e1c5598e05ffddf69fac5 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node16687929411e1c5598e05ffddf69fac5 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "Para el bizcocho
> (molde 22cm):" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node16687929411e1c5598e05ffddf69fac5 .
> _:nodeda504a8bc7ff3ed52c337355cfc2f32 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:nodeda504a8bc7ff3ed52c337355cfc2f32 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "4 huevos" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:nodeda504a8bc7ff3ed52c337355cfc2f32 .
> _:node4cbc939e99bcbe4916b8052cce951f0 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node4cbc939e99bcbe4916b8052cce951f0 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "120g
> az\u00C3\u00BAcar" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node4cbc939e99bcbe4916b8052cce951f0 .
> _:node18b572d55f1f49c8742f5832f7ca1df7 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node18b572d55f1f49c8742f5832f7ca1df7 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "80g harina" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node18b572d55f1f49c8742f5832f7ca1df7 .
> _:node4f2936f11ef2132f52cf8a136c79c51 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node4f2936f11ef2132f52cf8a136c79c51 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "20g harina de
> ma\u00C3\u00ADz" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node4f2936f11ef2132f52cf8a136c79c51 .
> _:noded6a7db9359456aa3d6662954f5c2963 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:noded6a7db9359456aa3d6662954f5c2963 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "1cucharada cacao
> en polvo" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:noded6a7db9359456aa3d6662954f5c2963 .
> _:node54b5ecd17d4ddec85fa5364a8a7392 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node54b5ecd17d4ddec85fa5364a8a7392 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "Para la tarta de
> naranja:" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node54b5ecd17d4ddec85fa5364a8a7392 .
> _:nodec65a69f3455ceda8fcc2afa0f2d49c65 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:nodec65a69f3455ceda8fcc2afa0f2d49c65 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "2sobres gelatina
> de naranja" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:nodec65a69f3455ceda8fcc2afa0f2d49c65 .
> _:node6aee7ca9c9f4793ccfdb3e77f9afce <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node6aee7ca9c9f4793ccfdb3e77f9afce <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "1 litro leche" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node6aee7ca9c9f4793ccfdb3e77f9afce .
> _:noded3d876338d9752e298e643aaae7858 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:noded3d876338d9752e298e643aaae7858 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "Para la
> cobertura:" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:noded3d876338d9752e298e643aaae7858 .
> _:node783dfe412abb82751fb50ae23646ff <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node783dfe412abb82751fb50ae23646ff <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "1trozo chocolate
> de cobertura" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node783dfe412abb82751fb50ae23646ff .
> _:nodef57df9a870a7e5cc6ed67f3e7a8d623 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:nodef57df9a870a7e5cc6ed67f3e7a8d623 <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "1trozo
> mantequilla" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:nodef57df9a870a7e5cc6ed67f3e7a8d623 .
> _:node66ec8bf0ef1ec431142c21cc1a968f2e <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> .
> _:node66ec8bf0ef1ec431142c21cc1a968f2e <
> http://vocab.sindice.net/any23#hrecipe/ingredientName> "trozos de
> naranja" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/ingredient>
> _:node66ec8bf0ef1ec431142c21cc1a968f2e .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/yield> "Para 8 personas" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/instructions>
> "Preparaci\u00C3\u00B3n paso a paso    \nPreparar el bizcocho: Separamos
> las yemas de las claras y a\u00C3\u00B1adimos a las yemas el
> az\u00C3\u00BAcar. Montamos la mezcla con la batidora de varillas, y
> a\u00C3\u00B1adimos poco a poco, con ayuda de una esp\u00C3\u00A1tula, el
> cacao. Montamos las claras al punto de nieve y las a\u00C3\u00B1adimos con
> la mezcla anterior. Precalentamos el horno arriba y abajo a 200\u00C2\u00BA
> mientras echamos la mezcla en un molde engrasado. Una vez listo el horno,
> lo bajamos a 170\u00C2\u00BA y lo horneamos en posici\u00C3\u00B3n central
> durante 8 o 10 minutos. Luego meter a la nevera y dejar enfriar. \n
>  \nPara la tarta de naranja: poner al fuego medio litro de leche y cuando
> est\u00C3\u00A9 hiviendo a\u00C3\u00B1adir los 2 sobres de gelatina de
> naranja hasta que queden disueltos y despu\u00C3\u00A9s a\u00C3\u00B1adir
> los otros 500ml de leche fr\u00C3\u00ADa. Verter encima de la base de
> bizcocho.\n    \nPara la cobertura: Fundir el chocolate y echarlo sobre la
> capa anterior (Cuando est\u00C3\u00A9 un poco cuajado, para que no se
> mezcle). Y decorar encima del chocolate con trozos de naranja caramelizada.
> Dejar enfriar en la nevera y ya se puede servir." .
> _:noded72dc17d6916ba79e19876c5baf93e6a <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Duration> .
> _:noded72dc17d6916ba79e19876c5baf93e6a <
> http://vocab.sindice.net/any23#hrecipe/durationTime> "20-40 min" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/duration>
> _:noded72dc17d6916ba79e19876c5baf93e6a .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/photo> <
> http://any23.org/descargas/foto.aspx?id=10881&w=340&h=280> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/photo> <
> http://any23.org/descargas/foto.aspx?id=10882&w=103&h=68> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/photo> <
> http://any23.org/descargas/foto.aspx?id=10883&w=103&h=68> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/photo> <
> http://any23.org/descargas/foto.aspx?id=10884&w=103&h=68> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/photo> <
> http://any23.org/descargas/foto.aspx?id=3603&w=120&h=80> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/author> "noelia21" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/batidora.aspx"
> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag>
> "/recetas/tags/chocolate.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/dulces.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/exotica.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/horno.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag>
> "/recetas/tags/internacional.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag>
> "/recetas/tags/mediterranea.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/naranjas.aspx"
> .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/postres.aspx" .
> _:node8d8610c0d82b27c73ca7b13cb9dd17c4 <
> http://vocab.sindice.net/any23#hrecipe/tag> "/recetas/tags/tartas.aspx" .
> <http://any23.org/tmp/> <http://www.facebook.com/2008/fbmlapp_id>
> "203441256369548" .
> <http://any23.org/tmp/> <http://opengraphprotocol.org/schema/language>
> "es" .
> <http://any23.org/tmp/> <http://opengraphprotocol.org/schema/title>
> "Receta de Tarta de naranja y chocolate - Gallina Blanca" .
> <http://any23.org/tmp/> <http://opengraphprotocol.org/schema/url> "
> http://www.gallinablanca.es/receta/tarta-de-naranja-y-chocolate.aspx" .
> <http://any23.org/tmp/> <http://opengraphprotocol.org/schema/image> "
> http://www.gallinablanca.es/descargas/foto.aspx?id=10881&w=340&h=280" .
> <http://any23.org/tmp/> <http://opengraphprotocol.org/schema/description>
> "La receta de Tarta de naranja y chocolate se prepara con:  Para el
> bizcocho (molde 22cm):, 4 huevos, 120g az\u00C3\u00BAcar, 80g harina, 20g
> harina de ma\u00C3\u00ADz, 1cucharada cacao en polvo,  Para la tarta de
> naranja:, 2sobres gelatina de naranja, 1 litro leche,  Para la cobertura:,
> 1trozo chocolate de cobertura, 1trozo mantequilla,  trozos de naranja" .
> <http://any23.org/tmp/> <http://any23.org/tmp/publisher> <
> https://plus.google.com/105493319455787602116> .
> <http://any23.org/tmp/css/thickbox.css> <http://any23.org/tmp/stylesheet>
> <https://plus.google.com/105493319455787602116> .
> <http://any23.org/tmp/css/thickbox_ie.css> <
> http://any23.org/tmp/stylesheet> <
> https://plus.google.com/105493319455787602116> .
> <http://any23.org/tmp/css/style.css> <http://any23.org/tmp/stylesheet> <
> https://plus.google.com/105493319455787602116> .
> <http://any23.org/tmp/css/print.css> <http://any23.org/tmp/stylesheet> <
> https://plus.google.com/105493319455787602116> .
> <http://any23.org/tmp/> <http://any23.org/tmp/canonical> <
> http://www.gallinablanca.es/receta/tarta-de-naranja-y-chocolate.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/batidora.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/chocolate.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/dulces.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/exotica.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/horno.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/internacional.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/mediterranea.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/naranjas.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/postres.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/tag> <
> http://any23.org/tmp//recetas/tags/tartas.aspx> .
> <http://any23.org/tmp/> <http://any23.org/tmp/nofollow> <
> http://www.gallinablancastar.com> .
> <http://any23.org/tmp/> <http://any23.org/tmp/nofollow> <
> https://www.confianzaonline.es/empresas/gallinablanca.htm> .
> <http://any23.org/tmp/> <http://any23.org/tmp/nofollow> <
> http://www.calidalia.org> .
>
>
>
> Use Java Lib Directly
> ---------------------
>
> _:node291f9650fb7cb180dfe9f2a517da3d4 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Recipe> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node2db92a81f15e0e34a74f1938ee71661 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node6a8e73a8c7b7ae98c4de29d7f105de5 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node312ed52e474c88e73381ffb1cced7f <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node843d8e81527589708c293b86acfd6538 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:nodeb4701607270bb5b5f068a6ffd5d36 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node9d82df158631ba6b1b86e67eafc2f18 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node446bd1cf164674a42ec6be12e1176ea <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node55db47b314a4fe5d30f78def4f35e1 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:noded1566ddf4b7bde404d53c0b0f3ff880 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:nodead1a35d4c581857a9369c5239492625 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:noded1b19daab03c227b181f52f289b90db <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node611354e4ae2246391f952effa94a4aa <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:nodea3c6b44caad7f06a660231ee54e614 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Ingredient> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
> _:node412ea813c154eb362482233c5be4a0e4 <
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <
> http://vocab.sindice.net/any23#hrecipe/Duration> <http://www.test.de>
> <ex:html-mf-hrecipe>  .
>
> Thanks,
> Robert
>
> -----Original Message-----
> From: Robert Meusel [mailto:rob...@informatik.uni-mannheim.de]
> Sent: Tuesday, 2 October 2012 21:14
> To: user@any23.apache.org
> Subject: Re: Irregularities with HRecipeExtractor
>
> Hi
>
> Any23.org extracts all the information included in the document, such as
> the links between recipe and ingredients and the values of the ingredients,
> i.e. the whole tree. My Java code, which uses the same lib version as
> any23.org, returns just one single triple for each occurrence of an
> ingredient or recipe, e.g. _node... #type ingredient. All other information
> is left out, such as the values of the nodes and the connection between
> recipe and ingredient.
>
>  Thanks for your help
>
>  Robert
>
>
>
> Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:
>
> >Hi Robert,
> >
> >2012/10/2 Robert Meusel <rob...@informatik.uni-mannheim.de>:
> >
> >> Does anybody know where these differences come from and how we can fix them?
> >
> >What are the differences?
> >
> >Thank you
> >Lewis
> >
>
>


-- 
Michele Mostarda
Senior Software Engineer
skype: michele.mostarda
twitter: micmos
mail: m...@michelemostarda.com
site : http://www.michelemostarda.com
