You should use an IDE for this; it will make your life a lot easier ;)
I use the default IntelliJ IDEA debugger and it works pretty well. I could
send you instructions for setting it up.

Best,
Dimitris


On Tue, Apr 23, 2013 at 3:59 PM, Julien Plu <
julien....@redaction-developpez.com> wrote:

> No, I don't have a debugger because I'm coding on a remote machine over SSH.
>
> And even with this code:
>
>
> override def extract(page: PageNode, subjectUri: String, pageContext: PageContext): Seq[Quad] = {
>     if (page.title.namespace != Namespace.Template || page.isRedirect ||
>         !page.title.decoded.contains("évolution population")) return Seq.empty
>
>     for (property <- findPropertyNodes(page)) {
>         println(property.toWikiText)
>     }
> }
> private def findPropertyNodes(node : Node) : List[PropertyNode] = {
>     node match {
>         case propertyNode : PropertyNode => List(propertyNode)
>         case _ => node.children.flatMap(findPropertyNodes)
>     }
> }
>
> Absolutely nothing is displayed, because the list returned by
> "findPropertyNodes" is empty, and I don't understand why. I know it's empty
> because of this check:
>
> if (findPropertyNodes(page).isEmpty) {
>     println("empty")
> }
> else {
>     println("no empty")
> }
>
> This prints "empty", whereas if I print "page.children" I get all of the
> template code. So the "findPropertyNodes" function doesn't find any
> property inside this template code :-(
>
> Best.
>
> Julien.
>
>
>
> 2013/4/23 Jona Christopher Sahnwaldt <j...@sahnwaldt.de>
>
>> On 23 April 2013 12:01, Julien Plu <julien....@redaction-developpez.com>
>> wrote:
>> > Sorry, but I really don't understand how the AST works (or Scala, for
>> > that matter). I am trying to retrieve all the PropertyNode objects
>> > contained in a PageNode, so I wrote:
>> >
>> >
>> > override def extract(page: PageNode, subjectUri: String, pageContext: PageContext): Seq[Quad] = {
>> >     if (page.title.namespace != Namespace.Template || page.isRedirect ||
>> >         !page.title.decoded.contains("évolution population")) return Seq.empty
>> >
>>
>> I think it would be good if you could get a picture of the structure
>> of the tree. It's usually not complicated, but a bit hard to explain
>> in text. Can you use a debugger? If so, set a breakpoint at the
>> following line and let the debugger show the page variable. Then click
>> into it, look at its children, and so on.
>>
>> We should add a toString() method to Node.scala (and some sub-classes)
>> that shows the structure.
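
Without a debugger (for example over SSH), a small helper that prints the tree can serve the same purpose. This is only a sketch: the classes below are simplified, hypothetical stand-ins for the framework's node classes, not the real definitions, and `TreeDump.dump` is a made-up name.

```scala
// Hypothetical, simplified stand-ins for the framework's node classes.
sealed trait Node { def children: List[Node] }
case class TextNode(text: String) extends Node { val children: List[Node] = Nil }
case class PropertyNode(key: String, children: List[Node]) extends Node
case class ParserFunctionNode(title: String, children: List[Node]) extends Node
case class PageNode(title: String, children: List[Node]) extends Node

object TreeDump {
  // One line per node, indented two spaces per level of depth.
  def dump(node: Node, depth: Int = 0): String = {
    val label = node match {
      case TextNode(text)               => "TextNode(" + text.trim + ")"
      case PropertyNode(key, _)         => "PropertyNode(key=" + key + ")"
      case ParserFunctionNode(title, _) => "ParserFunctionNode(" + title + ")"
      case PageNode(title, _)           => "PageNode(" + title + ")"
    }
    val line = ("  " * depth) + label
    (line :: node.children.map(child => dump(child, depth + 1))).mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample tree, shaped like the #switch template discussed in this thread.
    val page = PageNode("Données/Antony/évolution population", List(
      ParserFunctionNode("#switch", List(
        PropertyNode("an", List(TextNode("2010"))),
        PropertyNode("pop", List(TextNode("61793")))))))
    println(dump(page))
  }
}
```

Printing the real page this way would show which node types the parser actually produced, which is the same information a debugger would give.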
>>
>> >     for (node <- page.children) {
>> >         for (property <- allPropertiesNode(node)) {
>> >             println(property.toWikiText)
>> >         }
>> >     }
>> > }
>> >
>> > private def allPropertiesNode(node : Node) : List[PropertyNode] = {
>> >     node match {
>> >         case propertyNode : PropertyNode => List(propertyNode)
>> >         case _ = node.children
>> >    }
>>
>> This is almost right. If I understand correctly, you want to walk
>> through the whole tree and collect all property nodes. Change this
>> line:
>>
>>     case _ = node.children
>>
>> (does that even compile? I don't understand how... :-) ) to
>>
>>     case _ => node.children.flatMap(allPropertiesNode)
>>
>> (I think that should work, I'm not 100% sure.)
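
A self-contained sketch of that recursive collection, to show what the `flatMap` version does. The node classes are simplified, hypothetical stand-ins (not the framework's real definitions), and the sample tree is made up.

```scala
// Hypothetical, simplified stand-ins for the framework's node classes.
sealed trait Node { def children: List[Node] }
case class TextNode(text: String) extends Node { val children: List[Node] = Nil }
case class PropertyNode(key: String, children: List[Node]) extends Node
case class ParserFunctionNode(title: String, children: List[Node]) extends Node

object FindPropertyNodes {
  // A PropertyNode is returned as-is (its own children are not searched);
  // for any other node, the results from all children are concatenated.
  def findPropertyNodes(node: Node): List[PropertyNode] = node match {
    case propertyNode: PropertyNode => List(propertyNode)
    case _                          => node.children.flatMap(findPropertyNodes)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample tree with two property nodes and one text node.
    val tree = ParserFunctionNode("#switch", List(
      TextNode("\n"),
      PropertyNode("an", List(TextNode("2010"))),
      PropertyNode("pop", List(TextNode("61793")))))
    findPropertyNodes(tree).foreach(p => println(p.key))
  }
}
```

With this shape, an empty result means that no node anywhere in the tree is a PropertyNode, so the next thing to check is what node types the parser actually produced for the page.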
>>
>> Oh by the way, the method name should be allPropertyNodes. :-) Or
>> maybe findPropertyNodes is even better.
>>
>> Once the method works, you can drop the main loop in extract(). Instead of
>>
>> for (node <- page.children) {
>>     for (property <- allPropertiesNode(node)) {
>>         println(property.toWikiText)
>>     }
>> }
>>
>> you can just write
>>
>> for (property <- findPropertyNodes(page)) {
>>     println(property.toWikiText)
>> }
>>
>> But that's just cosmetic surgery, it has the same effect.
>>
>> Cheers,
>> JC
>>
>> > }
>> >
>> >
>> > And nothing is displayed on my screen :-(
>> >
>> > Any idea what I am doing wrong?
>> >
>> > Best.
>> >
>> > Julien.
>> >
>> >
>> > 2013/4/23 Julien Plu <julien....@redaction-developpez.com>
>> >>
>> >> Hi,
>> >>
>> >> "param" came from a bad copy-paste; the correct variable is "pop".
>> >>
>> >> By the way, thank you for the hint about the AST. I will take a look at
>> >> these classes and see how I can use them. I won't hesitate to ask if I
>> >> get stuck :-)
>> >>
>> >> Best.
>> >>
>> >> Julien.
>> >>
>> >>
>> >> 2013/4/22 Jona Christopher Sahnwaldt <j...@sahnwaldt.de>
>> >>>
>> >>> Hi Julien,
>> >>>
>> >>> On 22 April 2013 21:43, Julien Plu <julien....@redaction-developpez.com>
>> >>> wrote:
>> >>> > I started the code for the extractor and I have a problem with a
>> >>> > regex in Scala. The string is:
>> >>> >
>> >>> > http://fr.wikipedia.org/w/index.php?title=Mod%C3%A8le:Donn%C3%A9es/Antony/%C3%A9volution_population&action=edit
>> >>> >
>> >>> > And my regex is : val populationRegex = """|pop=(\d+)""".r
>> >>> >
>> >>> > And I use this piece of code :
>> >>> >
>> >>> > populationRegex findAllIn page.children.toString foreach (_ match {
>> >>> >     case populationRegex(pop) => println(page.title.decoded + " : pop : " + param)
>> >>>
>> >>> What is param?
>> >>>
>> >>> But more generally - did you try using the AST (abstract syntax tree)
>> >>> built by the parser, i.e. the tree whose root node is the PageNode?
>> >>> I'm not sure how good our parser is at dealing with stuff like
>> >>> "<includeonly>" and "{{#switch ...}}", but I think it works and
>> >>> page.children should contain a ParserFunctionNode [1] object for the
>> >>> #switch, which in turn has a child for each branch, e.g. one child for
>> >>> an=2010 and one for pop=61793. These children are PropertyNode [2]
>> >>> objects, which have a key and (who would have thought) more children.
>> >>> Well, in this case, just one child, which is a TextNode. In a
>> >>> nutshell: Find the "#switch" node, find children with keys "an" and
>> >>> "pop", and generate triples for their values.
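
A rough sketch of that walk, under stated assumptions: the node classes are simplified, hypothetical stand-ins for the framework's classes, the title value "#switch" is assumed, and the result is plain (year, population) pairs rather than the framework's Quad objects.

```scala
// Hypothetical, simplified stand-ins for the framework's node classes.
sealed trait Node { def children: List[Node] }
case class TextNode(text: String) extends Node { val children: List[Node] = Nil }
case class PropertyNode(key: String, children: List[Node]) extends Node
case class ParserFunctionNode(title: String, children: List[Node]) extends Node
case class PageNode(title: String, children: List[Node]) extends Node

object SwitchValues {
  // Depth-first search for every "#switch" parser function node.
  def findSwitchNodes(node: Node): List[ParserFunctionNode] = node match {
    case pf: ParserFunctionNode if pf.title == "#switch" => List(pf)
    case _ => node.children.flatMap(findSwitchNodes)
  }

  // Trimmed text of the first direct child property with the given key, if any.
  def propertyText(switchNode: ParserFunctionNode, key: String): Option[String] =
    switchNode.children.collectFirst {
      case PropertyNode(`key`, values) =>
        values.collect { case TextNode(text) => text }.mkString.trim
    }

  // (year, population) pairs for every switch node that has both keys.
  def yearAndPopulation(page: PageNode): List[(String, String)] =
    for {
      switchNode <- findSwitchNodes(page)
      an         <- propertyText(switchNode, "an").toList
      pop        <- propertyText(switchNode, "pop").toList
    } yield (an, pop)
}
```

Each pair would then be turned into triples for the page's subject; that step is left out here because it depends on the framework's own classes.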
>> >>>
>> >>> >     case _ =>
>> >>> > })
>> >>> >
>> >>> > Instead of getting "Données/Antony/évolution population : pop : 61793"
>> >>> > just once,
>> >>> >
>> >>> > I get "Données/Antony/évolution population : pop : null" many times, as
>> >>> > many times as there are lines in the string.
>> >>> >
>> >>> > Any idea what I am doing wrong?
>> >>> >
>> >>> > I'm a total beginner in Scala :-( sorry.
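
One likely cause of the repeated "null": in a regex, an unescaped `|` means alternation, so the pattern reads as "empty string OR pop=(\d+)". The empty alternative matches at every position and leaves the capture group unset. A small sketch, using a made-up sample string rather than the real page content:

```scala
object PopulationRegexDemo {
  // Unescaped: '|' is alternation, so this reads "empty string OR pop=(\d+)".
  val unescaped = """|pop=(\d+)""".r
  // Escaped: matches a literal '|' followed by "pop=" and digits.
  val escaped = """\|pop=(\d+)""".r

  def main(args: Array[String]): Unit = {
    // Hypothetical wiki text, only for illustration.
    val text = "{{#switch: {{{1}}}\n|an=2010\n|pop=61793\n}}"

    // The empty alternative is tried first and succeeds at every position,
    // so every match found is the empty string.
    val matches = unescaped.findAllIn(text).toList
    println("all matches empty: " + matches.forall(_.isEmpty))

    // Matching an empty string leaves group 1 unset, so pop is bound to null.
    "" match {
      case unescaped(pop) => println("pop : " + pop)
      case _              => println("no match")
    }

    // With the escaped pipe, group 1 captures the digits.
    escaped.findAllIn(text).matchData.foreach(m => println("pop : " + m.group(1)))
  }
}
```

Escaping the pipe fixes the regex, though walking the AST is still the cleaner approach for this kind of structured input.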
>> >>>
>> >>> Your code excerpt looks pretty good to me. :-)
>> >>>
>> >>> The AST is usually much safer and cleaner than regexes. Regexes are
>> >>> more suitable for unstructured strings, but here you're dealing with
>> >>> pretty clean structures. So I would suggest you write some code that
>> >>> walks through the PageNode tree. If you have any questions, don't
>> >>> hesitate to ask. We're looking forward to your contributions. Thanks!
>> >>>
>> >>> Cheers,
>> >>> JC
>> >>>
>> >>> [1] https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/ParserFunctionNode.scala
>> >>> [2] https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/PropertyNode.scala
>> >>>
>> >>> >
>> >>> > Best.
>> >>> >
>> >>> > Julien.
>> >>> >
>> >>> >
>> >>> > 2013/4/22 Jona Christopher Sahnwaldt <j...@sahnwaldt.de>
>> >>> >>
>> >>> >> The templates where the data is stored are not used directly in the
>> >>> >> main pages. It's a complicated process: page Toulouse uses template X,
>> >>> >> X uses Y, Y uses Z, and Z contains the data. Something like that; I'm
>> >>> >> not 100% sure, but the details don't matter. This means that
>> >>> >> wikiPageUsesTemplate and InfoboxExtractor won't help.
>> >>> >>
>> >>> >> Generating a separate file is probably the best idea. We could also
>> >>> >> send these new triples to the main mapping-based file, but that might
>> >>> >> be confusing: first, they're not mapping-based; second, new triples
>> >>> >> about a city would be added in a completely different place in the
>> >>> >> file. (That's not a big problem though.)
>> >>> >>
>> >>> >> Cheers,
>> >>> >> JC
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>>
>
>


-- 
Kontokostas Dimitris
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
