Hi Amit,

thanks for posting your question :)
The rule you mention treats a key as valid when it is not a plain number
(i.e. it does not consist only of digits), that is, when it is explicit.
This is because templates can have either explicit or implicit parameters:
- an explicit parameter has a name
- an implicit parameter is identified by its position, so it has no
name, only an index

E.g.
{{Template|name=...|surname=...}} => properties are { "name" => ...;
"surname" => ...}
and
{{Template|...|...}} => properties are { "1" => ...; "2" => ...}
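
To make the naming rule concrete, here is a minimal sketch in Scala. It is not the DBpedia parser: TemplateParams and toProperties are hypothetical names, and the arguments are assumed to be already split on the top-level pipes. Only the digit test in isExplicit is taken from the rule quoted in the original question.

```scala
// Hypothetical helper for illustration only, not the DBpedia parser.
object TemplateParams {
  // A key is explicit when it is not made of digits only
  // (same test as in the quoted extractor rule).
  def isExplicit(key: String): Boolean = !key.forall(_.isDigit)

  // Assumption: args are the template arguments, already split on "|".
  // "name=value" keeps its name; any other argument gets its 1-based
  // position among the unnamed arguments as key.
  def toProperties(args: Seq[String]): Seq[(String, String)] = {
    var nextIndex = 0
    args.map { arg =>
      arg.split("=", 2) match {
        case Array(key, value) => (key.trim, value.trim)
        case _ =>
          nextIndex += 1
          (nextIndex.toString, arg.trim)
      }
    }
  }
}
```

With this sketch, Seq("name=Ada", "surname=Lovelace") gets the keys "name" and "surname", while Seq("Ada", "Lovelace") gets the keys "1" and "2", and only the first pair of keys passes isExplicit.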

MinPercentageOfExplicitPropertyKeys is used to skip useless templates, i.e.
those in which too few of the keys are explicit.

The real problem with the page you mention is not the percentage we are
using, but how the template is filled in with data:

{{Infobox company
| name      =  International Speedway Corporation|
| logo      =  [[Image:Iscmotorsportslogo.png]]
| type      =  [[Public company|Public]]  |
| traded_as  = {{NASDAQ|ISCA}}<br />{{OTCQB|ISCB}}
| foundation        =  1953 (as Bill France Racing, Inc.)|
| location          =  1 Daytona Boulevard<br />[[Daytona Beach, Florida]]
 32114-1243|
| key_people        =  [[Bill France, Sr.]], founder<br/>[[Jim France]],
CEO<br/>[[Lesa Kennedy]], president|
| industry          =  [[Auto racing|Motorsports]]|
| products          =  Sporting events|
| revenue           =  {{decrease}} $633.91 million [[United States
dollar|USD]] (2010, November)|
| operating_income  =  {{decrease}} $115.64 million [[United States
dollar|USD]] (2010, November)|
| net_income        =  {{decrease}} $54.53 million [[United States
dollar|USD]] (2010, November)|
| num_employees     =   1,000 (full time) |
| homepage          =  [http://www.iscmotorsports.com/
www.iscmotorsports.com]|
}}

There are a number of useless, misleading trailing pipes ("|") at the end of
the template properties.
The effect is that the parser sees a number of implicit parameters, which
are counted in the list of params (hence the share of explicit keys in the
template is below the 75% threshold).
Can you fix the Wikipedia article?
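
As a rough check: the infobox above has 14 named parameters and 12 lines ending with a pipe. If each trailing pipe yields exactly one implicit parameter (an assumption about the parser), the share of explicit keys is 14 / 26, about 0.54, which is below 0.75 but above the 0.50 you tried. The Scala sketch below reproduces the effect on a shortened, made-up template body. TrailingPipeDemo, keys and explicitRatio are hypothetical names, and the naive split on every "|" only works because the sample contains no nested links or templates.

```scala
// Illustration only: a naive splitter, not the DBpedia wiki parser.
object TrailingPipeDemo {
  val MinPercentageOfExplicitPropertyKeys = 0.75

  // Assumption: the body has no nested [[..|..]] or {{..|..}}, so every "|"
  // separates two arguments. The first chunk is the template name.
  def keys(body: String): Seq[String] = {
    var nextIndex = 0
    body.split("\\|", -1).toSeq.drop(1).map { arg =>
      arg.split("=", 2) match {
        case Array(key, _) => key.trim
        case _ =>
          // No "=": positional argument, even when it is only whitespace.
          nextIndex += 1
          nextIndex.toString
      }
    }
  }

  // Share of keys that are not made of digits only.
  def explicitRatio(keys: Seq[String]): Double =
    keys.count(key => !key.forall(_.isDigit)).toDouble / keys.size

  // Made-up sample bodies, with and without trailing pipes.
  val withTrailingPipes =
    "Infobox company\n| name = ISC|\n| type = Public|\n| products = Sporting events|\n"
  val cleaned =
    "Infobox company\n| name = ISC\n| type = Public\n| products = Sporting events\n"
}
```

In this sketch every "|" followed by a newline and another "|" leaves an argument without a name, so withTrailingPipes ends up with three explicit and three implicit keys and fails the "> MinPercentageOfExplicitPropertyKeys" check, while cleaned has explicit keys only and passes.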

A more general question: which characters are allowed in an implicit template
param?

Cheers
Andrea


2013/12/19 Amit Kumar <[email protected]>

> Hi,
> Today while looking at the extracted dataset we found we are not getting
> any infobox properties output for some pages.
> For example if you try for
> http://en.wikipedia.org/wiki/International_Speedway_Corporation
>
> Debugging told me that the problem lies in the Infobox Extractor
>
> val MinPercentageOfExplicitPropertyKeys = 0.75
> ...
>
> val countExplicitPropertyKeys = propertyList.count(property =>
> !property.key.forall(_.isDigit))
> if ((countExplicitPropertyKeys >= MinPropertyCount) &&
> (countExplicitPropertyKeys.toDouble / propertyList.size) >
> MinPercentageOfExplicitPropertyKeys)
> {
> ..
> ..
> }
>
> What I think it says is that we should only parse templates where more
> than 75% of the keys in the (key, value) pairs are valid keys. The
> above-mentioned wiki page doesn't make the cut. Can someone tell me about
> this 75% cutoff? I tried a 50% limit and it gives the desired output. I know
> lowering it will start giving more data, some of which might be of bad quality.
>
>
>
> Regards
> Amit
>
>
>
> ------------------------------------------------------------------------------
> Rapidly troubleshoot problems before they affect your business. Most IT
> organizations don't have a clear picture of how application performance
> affects their revenue. With AppDynamics, you get 100% visibility into your
> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics
> Pro!
> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
> _______________________________________________
> Dbpedia-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>