Hi Tyler,

The extracted URIs are based on the infobox mappings and the data in
Wikipedia. The URIs you listed are identical to the ones in Wikipedia.
Our framework is deliberately tolerant about which URIs it accepts as
valid, since it is supposed to represent the original Wikipedia data.
We could easily apply stricter URI validation, but we don't want to throw
away data that might be useful to others.
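
If you need strictly absolute URIs on the consuming side, a filter along
these lines might work as a stopgap. It is only a rough Python sketch and
not part of the DBpedia framework. The function names are made up for
illustration. It assumes one triple per line with no whitespace inside
URIs, and it treats anything that starts with "scheme:" as absolute, which
is a heuristic rather than full URI validation.

```python
import re

# Heuristic: an absolute URI starts with a scheme followed by ':'.
ABSOLUTE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def uri_terms(line):
    """Return the URIs found in subject, predicate and object position.

    Literal objects are skipped, so angle brackets inside a quoted
    string are not mistaken for URIs.
    """
    parts = line.strip().split(None, 2)
    if len(parts) < 3:
        return []
    subject, predicate, rest = parts
    terms = [t for t in (subject, predicate) if t.startswith("<")]
    if rest.startswith("<"):
        # The object is a URI: take everything up to the closing '>'.
        terms.append(rest[: rest.find(">") + 1])
    return [t[1:-1] for t in terms]


def split_triples(lines):
    """Separate lines into (kept, dropped) by the absolute-URI heuristic.

    dropped holds (line_number, line) pairs so they can be reported.
    """
    kept, dropped = [], []
    for number, line in enumerate(lines, start=1):
        if all(ABSOLUTE.match(uri) for uri in uri_terms(line)):
            kept.append(line)
        else:
            dropped.append((number, line))
    return kept, dropped
```

Calling split_triples() on the lines of the dump gives you the triples to
import plus a list of the rejected ones with their line numbers, which
makes it easier to trace each one back to the Wikipedia article it came
from.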

Cheers,
Max

On Sat, Jun 19, 2010 at 12:24 AM, R. Tyler Ballance <ty...@monkeypox.org> wrote:
> I'm working with 3.5.1, and I've noticed that mappingbased_properties_en.nt,
> compared to the other sets that I've worked with, is *full* of errors that
> break my imports in funky ways.
>
> There are a number of non-absolute URLs:
>
>    ERROR: Malformed document: Not a valid (absolute) URI: www.newfreedomboro.org/index2.htm [line 600646]
>    ERROR: Malformed document: Not a valid (absolute) URI: www.rubenblades.com [line 975491]
>    ERROR: Malformed document: Not a valid (absolute) URI: Fansite [line 1056096]
>    ERROR: Malformed document: Not a valid (absolute) URI: None [line 278162]
>
> (Just a couple of examples)
>
> As a matter of practice, I've just been dropping malformed entities from the
> file, but I'm wondering if there's anything I can do to track down the errors
> to help improve the next release?
>
> Would filing a ticket with a unified diff of the 3.5.1
> mappingbased_properties_en.nt file compared to my modified one be helpful?
>
>
> Cheers,
> -R. Tyler Ballance
> --------------------------------------
>  Jabber: rty...@jabber.org
>  GitHub: http://github.com/rtyler
> Identica: http://identi.ca/dero
>  Twitter: http://twitter.com/agentdero
>    Blog: http://unethicalblogger.com
>
>
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>
