[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module
The proposal to merge lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module has been updated.
Status: Needs review => Rejected
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC-documentation/+merge/103728
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module.
--
Mailing list: https://launchpad.net/~zorba-coders
Post to     : zorba-coders@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zorba-coders
More help   : https://help.launchpad.net/ListHelp
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module
Diogo Simões has proposed merging lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module.
Requested reviews:
  Zorba Coders (zorba-coders)
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC-documentation/+merge/103902
Addition of return types in function signatures. Applied in the conversion, consolidation and set-similarity modules.
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/DC-documentation into lp:zorba/data-cleaning-module.

=== modified file 'src/com/zorba-xquery/www/modules/data-cleaning/consolidation.xq'
--- src/com/zorba-xquery/www/modules/data-cleaning/consolidation.xq 2011-08-01 11:26:53 +
+++ src/com/zorba-xquery/www/modules/data-cleaning/consolidation.xq 2012-04-27 15:23:42 +
@@ -50,7 +50,7 @@
  : @return The most frequent node in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-frequent.xq
  :)
-declare function con:most-frequent ( $s ) {
+declare function con:most-frequent ( $s ) as item(){
  (for $str in set:distinct($s) order by count($s[deep-equal(.,$str)]) descending return $str)[1]
 };

@@ -67,7 +67,7 @@
  : @return The least frequent node in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/leastfrequent_1.xq
  :)
-declare function con:least-frequent ( $s ) {
+declare function con:least-frequent ( $s ) as item(){
  let $aux := for $str in set:distinct($s) order by count($s[deep-equal(.,$str)]) return $str
  return if (count($aux) = 0) then () else ($aux[1])
 };
@@ -242,7 +242,7 @@
  : @return The node having the largest number of descending elements in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-elements.xq
  :)
-declare function con:most-elements ( $s ) {
+declare function con:most-elements ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::element()) descending return $str)[1]
 };

@@ -260,7 +260,7 @@
  : @return The node having the largest number of descending attributes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-attributes.xq
  :)
-declare function con:most-attributes ( $s ) {
+declare function con:most-attributes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::*/attribute()) descending return $str)[1]
 };

@@ -278,7 +278,7 @@
  : @return The node having the largest number of descending nodes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-nodes.xq
  :)
-declare function con:most-nodes ( $s ) {
+declare function con:most-nodes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::node()) descending return $str)[1]
 };

@@ -296,7 +296,7 @@
  : @return The node having the smallest number of descending elements in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/least-elements.xq
  :)
-declare function con:least-elements ( $s ) {
+declare function con:least-elements ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::element()) return $str)[1]
 };

@@ -314,7 +314,7 @@
  : @return The node having the smallest number of descending attributes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/least-attributes.xq
  :)
-declare function con:least-attributes ( $s ) {
+declare function con:least-attributes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::*/attribute()) return $str)[1]
 };

@@ -332,7 +332,7 @@
  : @return The node having the smallest number of descending nodes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/least-nodes.xq
  :)
-declare function con:least-nodes ( $s ) {
+declare function con:least-nodes ( $s ) as element(){
  (for $str in set:distinct($s) order by count($str/descendant-or-self::node()) return $str)[1]
 };

@@ -350,7 +350,7 @@
  : @return The node having the largest number of distinct descending elements in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-distinct-elements.xq
  :)
-declare function con:most-distinct-elements ( $s ) {
+declare function con:most-distinct-elements ( $s ) as element(){
  (for $str in set:distinct($s) order by count(set:distinct($str/descendant-or-self::element())) descending return $str)[1]
 };

@@ -368,7 +368,7 @@
  : @return The node having the largest number of distinct descending attributes in the input sequence.
  : @example test/Queries/data-cleaning/consolidation/most-distinct-attributes.xq
  :)
-declare function con:most-distinct-attributes ( $s ) {
+declare function con:most-distinct-attributes ( $s ) as element(){
  (for $str in set:distinct($s) order by count(set:distinct($str/descendant-or-self::*/attribute())) descending return $str)[1
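A note on the return types proposed above: in XQuery, a sequence type without an occurrence indicator, such as item() or element(), matches exactly one item. A body that can evaluate to the empty sequence, like the "then ()" branch of con:least-frequent or any "(...)[1]" over an empty input, would therefore be expected to raise a type error (err:XPTY0004) when called with an empty sequence, unless the type is written with "?". The sketch below illustrates this with a hypothetical local:least-frequent, which is not the module's code: it assumes atomic input and uses fn:distinct-values in place of the module's set:distinct.

```xquery
xquery version "1.0";

(: Hypothetical stand-in for con:least-frequent, restricted to atomic values.
   fn:distinct-values replaces the module's set:distinct (an assumption made
   so that the example needs no Zorba-specific imports). :)
declare function local:least-frequent ( $s as xs:anyAtomicType* ) as xs:anyAtomicType? {
  (for $v in distinct-values($s)
   order by count($s[. eq $v])      (: ascending: rarest value first :)
   return $v)[1]
};

(
  local:least-frequent(("a", "b", "a")),   (: "b" occurs once, "a" twice :)
  count(local:least-frequent(()))          (: empty input gives the empty sequence :)
)
```

With the "?" indicator the second call returns the empty sequence, so the query should yield "b" followed by 0. Declaring the same function "as xs:anyAtomicType" (no "?") would make the second call fail with a type error instead.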
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module has been updated.
Status: Approved => Needs review
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module.
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module has been updated.
Status: Needs review => Approved
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module.
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module
Diogo Simões has proposed merging lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module.
Requested reviews:
  Zorba Coders (zorba-coders)
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning-thesaurus/+merge/100683
This revision includes a new normalization function: capitalize($string as xs:string) as xs:string.
It also includes the thesaurus-based module, with the check-related ( $s1 as xs:string, $s2 as xs:string, $uri as xs:string, $type as xs:string ) and the related-terms ( $s1 as xs:string, $uri as xs:string, $type as xs:string ) functions.
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/data-cleaning-thesaurus into lp:zorba/data-cleaning-module.

=== modified file 'src/com/zorba-xquery/www/modules/data-cleaning/normalization.xq'
--- src/com/zorba-xquery/www/modules/data-cleaning/normalization.xq 2011-11-08 21:16:29 +
+++ src/com/zorba-xquery/www/modules/data-cleaning/normalization.xq 2012-04-03 20:16:21 +
@@ -31,12 +31,34 @@
 module namespace normalization = "http://www.zorba-xquery.com/modules/data-cleaning/normalization";

 import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
+import module namespace ft = "http://www.zorba-xquery.com/modules/full-text";

 declare namespace ann = "http://www.zorba-xquery.com/annotations";
 declare namespace ver = "http://www.zorba-xquery.com/options/versioning";
 declare option ver:module-version "2.0";

 (:~
+: Converts a given string into a capitalized representation.
+:
+: @param $string The string to be capitalized.
+:
+: @return The string resulting from the conversion.
+: @example test/Queries/data-cleaning/normalization/capitalize.xq
+:)
+declare function normalization:capitalize ($string as xs:string) as xs:string{
+  let $ttokens := tokenize ($string, " ")
+  let $cap-tokens :=
+    for $toks in $ttokens[position()>1]
+    let $capitalized-tokens :=
+      if (not(ft:is-stop-word($toks)))
+      then concat(upper-case(substring($toks, 1,1)), substring(lower-case($toks), 2), " ")
+      else concat(lower-case($toks), " ")
+    return $capitalized-tokens
+  let $cap-string := concat(concat(upper-case(substring($ttokens[position()=1], 1,1)), substring(lower-case($ttokens[position()=1]), 2), " "), string-join($cap-tokens))
+  return substring($cap-string, 1, string-length($cap-string)-1)
+};
+
+(:~
  : Converts a given string representation of a date value into a date representation valid according
  : to the corresponding XML Schema type.
  :

=== added file 'src/com/zorba-xquery/www/modules/data-cleaning/thesaurus-based.xq'
--- src/com/zorba-xquery/www/modules/data-cleaning/thesaurus-based.xq 1970-01-01 00:00:00 +
+++ src/com/zorba-xquery/www/modules/data-cleaning/thesaurus-based.xq 2012-04-03 20:16:21 +
@@ -0,0 +1,74 @@
+(:
+ : Copyright 2006-2009 The FLWOR Foundation.
+ :
+ : Licensed under the Apache License, Version 2.0 (the "License");
+ : you may not use this file except in compliance with the License.
+ : You may obtain a copy of the License at
+ :
+ : http://www.apache.org/licenses/LICENSE-2.0
+ :
+ : Unless required by applicable law or agreed to in writing, software
+ : distributed under the License is distributed on an "AS IS" BASIS,
+ : WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ : See the License for the specific language governing permissions and
+ : limitations under the License.
+ :)
+
+(:~
+ : This library module provides thesaurus functions for checking semantic relations between strings
+ : and for checking abbreviations.
+
+ : These functions are particularly useful in tasks related to the creation of semantic mappings.
+ :
+ :
+ : @author Bruno Martins and Diogo Simões
+ :)
+
+module namespace thesaurus = "http://www.zorba-xquery.com/modules/data-cleaning/thesaurus";
+
+import module namespace ft = "http://www.zorba-xquery.com/modules/full-text";
+
+(:~
+ : Checks if two strings have a relationship defined in a given thesaurus.
+ : The implementation of this function depends on the full-text module.
+ :
+ :
+ : @param $s1 The first string.
+ : @param $s2 The second string.
+ : @param $uri The uri of the thesaurus to be considered.
+ : @param $type An identifyer for the type of relationship.
+ :
+ : @return true if the first string has the provided relationship with the second string defined in the thesaurus and false otherwise.
+ : @example test/Queries/data-cleaning/thesaurus-based/check-related.xq
+ :
+ :)
+declare function thesaurus:check-related ( $s1 as xs:string, $s2 as xs:string, $uri as xs:string, $type as xs:string ) as xs:boolean {
+  let $relation := ft:thesaurus-lookup( $uri,
+                                        $s2,
+                                        xs:language("en"),
+                                        $type )
+  return $relation = $s1
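The last line of check-related relies on the XQuery general comparison "=", which is existential: a sequence compared with a single value is true if at least one item in the sequence equals that value. The sketch below shows that behaviour in isolation. The $related sequence is a made-up stand-in for a thesaurus lookup result, not actual output of ft:thesaurus-lookup.

```xquery
xquery version "1.0";

(: Hypothetical lookup result; a real call would obtain this from the
   full-text module and a configured thesaurus. :)
let $related := ("car", "automobile", "vehicle")
return (
  $related = "automobile",   (: true: one item in the sequence matches :)
  $related = "boat",         (: false: no item matches :)
  () = "boat"                (: false: an empty result never matches :)
)
```

One consequence worth keeping in mind is that an empty lookup result yields false rather than an error, so a term missing from the thesaurus and a term with no such relationship look the same to the caller.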
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC-conversion-tests into lp:zorba/data-cleaning-module
The proposal to merge lp:~diogo-simoes89/zorba/DC-conversion-tests into lp:zorba/data-cleaning-module has been updated.
Status: Needs review => Approved
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC-conversion-tests/+merge/91599
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/DC-conversion-tests into lp:zorba/data-cleaning-module.
[Zorba-coders] [Merge] lp:zorba/data-cleaning-module into lp:~diogo-simoes89/zorba/DC
Diogo Simões has proposed merging lp:zorba/data-cleaning-module into lp:~diogo-simoes89/zorba/DC.
Requested reviews:
  Diogo Simões (diogo-simoes89)
For more details, see:
https://code.launchpad.net/~zorba-coders/zorba/data-cleaning-module/+merge/91121
Changes in the tests of the conversion module:
- address-from-phone
- address-from-user
- phone-from-address
- phone-from-user
- user-from-address
- user-from-phone
These changes make the tests tolerate variations in the web services' results.
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.
[Zorba-coders] [Merge] lp:zorba/data-cleaning-module into lp:~diogo-simoes89/zorba/DC
The proposal to merge lp:zorba/data-cleaning-module into lp:~diogo-simoes89/zorba/DC has been updated.
Status: Needs review => Approved
For more details, see:
https://code.launchpad.net/~zorba-coders/zorba/data-cleaning-module/+merge/91121
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module
The proposal to merge lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module has been updated.
Status: Needs review => Approved
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/DC/+merge/91124
Your team Zorba Coders is requested to review the proposed merge of lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module.
Re: [Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module
Thanks Chris. Guess it is done now.

To: mp+91...@code.launchpad.net
From: chillery+launch...@lambda.nu
Subject: Re: [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module
Date: Wed, 1 Feb 2012 17:18:20 +

Diogo, you need to set the commit message for the merge proposal in order for the validation queue to run.
--
https://code.launchpad.net/~diogo-simoes89/zorba/DC/+merge/91124
You are the owner of lp:~diogo-simoes89/zorba/DC.
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.
Re: [Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/DC into lp:zorba/data-cleaning-module
Review: Approve
Changes in the tests of the conversion module:
- address-from-phone
- address-from-user
- phone-from-address
- phone-from-user
- user-from-address
- user-from-phone
These changes make the tests tolerate variations in the web services' results.
--
https://code.launchpad.net/~diogo-simoes89/zorba/DC/+merge/91124
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.
[Zorba-coders] [Merge] lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module
The proposal to merge lp:~diogo-simoes89/zorba/data-cleaning into lp:zorba/data-cleaning-module has been updated.
Status: Needs review => Approved
For more details, see:
https://code.launchpad.net/~diogo-simoes89/zorba/data-cleaning/+merge/79530
Your team Zorba Coders is subscribed to branch lp:zorba/data-cleaning-module.