[ https://issues.apache.org/jira/browse/STANBOL-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828641#comment-13828641 ]

Rupert Westenthaler commented on STANBOL-1214:
----------------------------------------------

Thanks, Viktor, for the patch. I will download a new dump and test the updated 
script. Most likely I will keep both versions of the script, as Google might 
decide to switch back to the old format.

> Fix for fbranking.sh script
> ---------------------------
>
>                 Key: STANBOL-1214
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1214
>             Project: Stanbol
>          Issue Type: Bug
>            Reporter: Viktor Gal
>
> The format of the Freebase dump has changed: it now contains full URIs, 
> so the fbranking.sh script for counting incoming links is obsolete. Here 
> is a quick fix for the new dump format:
> gunzip -c db/freebase-rdf-2013-11-17.gz \
> | grep "^<http://rdf.freebase.com/ns/m\..*<.*>.*<http://rdf.freebase.com/ns/m\." \
> | cut -f 3 \
> | sed 's/.*\/ns\/\(.*\)>/\1/g' \
> | sort -S $MAX_SORT_MEM \
> | uniq -c \
> | sort -nr -S $MAX_SORT_MEM > $INCOMING_FILE
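
A minimal sketch of how the quoted pipeline might be checked on a few
hand-written triples before running it against a full dump. The sample
triples, the predicate names, and the function names `sample_triples` and
`count_incoming` are assumptions made for illustration; they are not part of
fbranking.sh. The sketch also assumes the dump is tab-separated, which the
`cut -f 3` step relies on, and it leaves out `-S $MAX_SORT_MEM` because the
sample is tiny.

```shell
#!/bin/sh
# Hypothetical sample in the new dump format: tab-separated triples with full URIs.
sample_triples() {
  ns='http://rdf.freebase.com/ns'
  printf '<%s/m.0a>\t<%s/type.object.key>\t<%s/m.0b>\t.\n' "$ns" "$ns" "$ns"
  printf '<%s/m.0c>\t<%s/people.person.spouse>\t<%s/m.0b>\t.\n' "$ns" "$ns" "$ns"
  printf '<%s/m.0b>\t<%s/common.topic.image>\t<%s/m.0a>\t.\n' "$ns" "$ns" "$ns"
  # Literal object: not a link between two topics, so it should be ignored.
  printf '<%s/m.0a>\t<%s/type.object.name>\t"Foo"@en\t.\n' "$ns" "$ns"
}

# Same steps as the quoted pipeline, reading triples from stdin.
count_incoming() {
  # Keep triples whose subject and object are both m.* topics.
  grep '^<http://rdf\.freebase\.com/ns/m\..*<.*>.*<http://rdf\.freebase\.com/ns/m\.' \
    | cut -f 3 \
    | sed 's/.*\/ns\/\(.*\)>/\1/' \
    | sort \
    | uniq -c \
    | sort -nr
}

# Lists m.0b with a count of 2, then m.0a with a count of 1.
sample_triples | count_incoming
```

Wrapping the steps in a function that reads from stdin would make it easier
to keep an old-format and a new-format variant side by side and feed either
one from `gunzip -c`.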



--
This message was sent by Atlassian JIRA
(v6.1#6144)
