Hi Richard

Thanks for the email.

Aghh, re the spelling: my mistake, sorry. But alas, it still throws an
exception. I've tested a range of websites. I'm running Windows 10,
TBCME 6.2.2.

When the URL
http://localhost:8083/tbl/lib/generaltesting/localFiles/fire.html is
viewed in the browser I get the message below. The folder generaltesting is
the main project folder.

[image: TopBraid Suite Console error page]

An error has been reported:

*No folder found with alias "generaltesting".*

Version: 6.2.2.v20190507-2132R

I did try to find other folders/files on the server, guessing a few from
TBCME folders, but nothing other than http://localhost:8083/tbl/ works. Is
the lib folder name right? Is there a way to list/explore directories on
the Jetty server somehow? I see there is a way for Jetty setups generally,
but I wasn't sure how to do so within TBCME.
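
To rule out the browser and the capitalisation issue Richard mentioned, here is a minimal Python sketch (outside TBCME, standard library only) that compares two URL spellings and reports the HTTP status the server returns for each. The helper names `http_status` and `case_mismatches` are made up for illustration; the two URLs are the ones from this thread.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 when the server has no such resource
    except (urllib.error.URLError, OSError):
        return None  # server not running, connection refused, timeout, ...


def case_mismatches(url_a, url_b):
    """List (position, char_a, char_b) where two URLs differ only by letter case."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(url_a, url_b))
        if a != b and a.lower() == b.lower()
    ]


if __name__ == "__main__":
    browser_url = "http://localhost:8083/tbl/lib/generaltesting/localFiles/fire.html"
    module_url = "http://localhost:8083/tbl/lib/generaltesting/localFIles/fire.html"
    # Show where the two spellings disagree in capitalisation.
    print(case_mismatches(browser_url, module_url))
    # Show what the local server answers for each spelling.
    for url in (browser_url, module_url):
        print(http_status(url), url)
```

If both spellings come back with 404, the capitalisation is probably not the only problem and the folder alias itself is not being resolved.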


On Tue, Nov 19, 2019 at 9:20 PM Richard Cyganiak <rich...@topquadrant.com>
wrote:

> Hi Simon,
>
> Just checking, does the URL work if you access it in a web browser? If
> not, that needs to be fixed first.
>
> I note that you said "localFiles" in one place and "localFIles" in
> another, with a different capitalisation of the "i". Depending on your
> operating system, this may be a problem.
>
> Richard
>
>
> On 17 Nov 2019, at 20:02, Simon Opper <simon.op...@surroundaustralia.com>
> wrote:
>
> Hi Holger
>
> Thanks for the email.
>
> I couldn't get your suggestion to work.
>
> I tried:
>
> I created a folder under a project folder, e.g.
> generaltesting\localFiles.www, and placed the HTML file there.
>
> In an sml:ImportXHTML module I used
>
> :ImportXHTML_1
>   a sml:ImportXHTML ;
>   sm:nodeX 18 ;
>   sm:nodeY 151 ;
>   sm:outputVariable "xml" ;
>   sml:url "http://localhost:8083/tbl/lib/generaltesting/localFIles/fire.html" ;
>   rdfs:label "Import XHTML 1" ;
>   skos:prefLabel "Import XHTML 1" ;
> .
>
> If I try to open that file via the URL in the TBCME browser, it can't find
> it and says the folder was not found.
>
> When I run the module I get the following error:
>
> java.lang.reflect.InvocationTargetException
>     at org.topbraidcomposer.sparqlmotion.actions.AbstractExecuteSPARQLMotionAction$1.run(AbstractExecuteSPARQLMotionAction.java:160)
>     at org.topbraidcomposer.core.util.ThreadUtil$1$1.run(ThreadUtil.java:66)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.topbraid.spin.sparqlmotion.modules.SMException: Failed to load HTML from http://localhost:8083/tbl/lib/generaltesting/localFIles/fire.html
>     at org.topbraid.sparqlmotion.lib.internal.ImportXHTMLModule.execute(ImportXHTMLModule.java:42)
>     at org.topbraid.spin.sparqlmotion.engine.impl.ExecutionEngineImpl.execute(ExecutionEngineImpl.java:202)
>     at org.topbraid.spin.sparqlmotion.engine.impl.ExecutionEngineImpl.executeModule(ExecutionEngineImpl.java:168)
>     at org.topbraid.spin.sparqlmotion.engine.impl.ExecutionEngineImpl.execute(ExecutionEngineImpl.java:118)
>     at org.topbraidcomposer.sparqlmotion.views.console.SPARQLMotionConsole.execute(SPARQLMotionConsole.java:79)
>     at org.topbraidcomposer.sparqlmotion.actions.AbstractExecuteSPARQLMotionAction$1.run(AbstractExecuteSPARQLMotionAction.java:149)
>     ... 2 more
> Caused by: org.jsoup.HttpStatusException: HTTP error fetching URL. Status=404, URL=http://localhost:8083/tbl/lib/generaltesting/localFIles/fire.html
>     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:682)
>     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:629)
>     at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:261)
>     at org.jsoup.helper.HttpConnection.get(HttpConnection.java:250)
>     at org.topbraid.html2xml.HTML2XML.parseFromURL(HTML2XML.java:28)
>     at org.topbraid.sparqlmotion.lib.internal.ImportXHTMLModule.execute(ImportXHTMLModule.java:37)
>     ... 7 more
>
>
> On Thu, Nov 14, 2019 at 12:45 PM Holger Knublauch <hol...@topquadrant.com>
> wrote:
>
>> Hi Simon,
>> On 14/11/2019 06:52, Simon Opper wrote:
>>
>> Hi folks
>>
>> I'm having an issue importing/converting a local HTML file using
>> SPARQLMotion scripts, as opposed to a web file at a URL. A web HTML file
>> works fine in my tests.
>>
>> I can use the ImportXHTML module on a web URL fine, e.g.
>> www.examplesite.com/htmlfile
>> But if I try to point it at a local file it fails. I've tried the following
>> file: URL forms without success, e.g. file:///fileLocation/htmlfile
>> (specifying the .html file type makes no difference, with or without the
>> .html extension added on disk). I also tried
>> file://localhost/fileLocation/htmlfile with no success.
>>
>> I also tried converting the HTML file to XHTML using Oxygen XML, but this
>> made no difference.
>>
>> Are you referring to this error?
>>
>> Caused by: java.net.MalformedURLException: Only http & https protocols supported
>>     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:636)
>>     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:629)
>>     at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:261)
>>     at org.jsoup.helper.HttpConnection.get(HttpConnection.java:250)
>>     at org.topbraid.html2xml.HTML2XML.parseFromURL(HTML2XML.java:28)
>>     at org.topbraid.sparqlmotion.lib.internal.ImportXHTMLModule.execute(ImportXHTMLModule.java:37)
>>     ... 7 more
>>
>>
>> Is there some aspect of the tidy function, or something else at play, that
>> I'm either missing or can't control for a local file?
>>
>> I guess we could switch to this if the URL is a local file:
>>
>> https://jsoup.org/cookbook/input/load-document-from-file
>>
>> Would this solve your use case? (There still is time for the 6.3 final
>> release).
>>
>> Given that the current version only supports HTTP, could you use the
>> EDG/TBL server to access the files? For example, with TBC-ME:
>>
>> 1) Create a folder in the workspace such as myfiles.www
>>
>> 2) Copy your .html file(s) into that folder, e.g. hk.html
>>
>> 3) Use sml:url http://localhost:8083/tbl/lib/myfiles/hk.html
>>
>> In my quick test that worked fine.
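
Read literally, the three steps above map a workspace folder named `<alias>.www` to `http://localhost:8083/tbl/lib/<alias>/`. The sketch below encodes only that one observed case; the function name `tbl_lib_url` is hypothetical, and it deliberately rejects nested paths such as generaltesting\localFiles.www, because nothing in this thread confirms how (or whether) a `.www` folder inside a project folder is exposed.

```python
from urllib.parse import quote


def tbl_lib_url(www_folder, file_name, base="http://localhost:8083/tbl/lib"):
    """Build the URL for a file in a top-level '<alias>.www' workspace folder.

    Assumption taken from the example in this thread:
    myfiles.www/hk.html -> http://localhost:8083/tbl/lib/myfiles/hk.html
    """
    suffix = ".www"
    if "/" in www_folder or "\\" in www_folder or not www_folder.endswith(suffix):
        raise ValueError("expected a top-level folder name ending in '.www'")
    alias = www_folder[: -len(suffix)]  # folder name without the .www suffix
    return f"{base}/{quote(alias)}/{quote(file_name)}"


if __name__ == "__main__":
    print(tbl_lib_url("myfiles.www", "hk.html"))
```

Under this reading, the first path segment after /tbl/lib/ is the alias of the `.www` folder itself, not the name of the project that contains it.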
>>
>>
>> As a related question on debugging this: is it possible to see more info
>> anywhere about these modules, other than the basic info in the TBCME help
>> and the SPIN vocab files, which are only of limited help? E.g. more details
>> on the underlying classes and signatures?
>>
>> Not that I could think of. The stack traces should show you some of the
>> internals, but only if something goes wrong.
>>
>> Maybe the rest of the email can be ignored if the solution above works
>> for you?
>>
>> Holger
>>
>>
>>
>>
>> I then tried another route, the ConvertXMLToRDF module. The usage note
>> for the module says that sml:xmlType can be set to "XHTML" so that it
>> "treats the input as HTML source"; see the reference below.
>>
>>
>> " sml:xml: The XML document that shall be converted to RDF. To avoid
>> character encoding issues, we strongly recommend this value to be a
>> reference to an already parsed XML document, and not a literal. In other
>> words, use "Add SPARQL expression" from the drop down menu and enter
>> ?varName and do not use a string value such as {?varName}. The actual
>> document parsing should be handled by predecessing modules such as
>> sml:ImportXMLFromURL.
>>
>>
>> sml:xmlType (xsd:string): [Optional] An (optional) type indicator for the
>> Semantic XML conversion. Current supported values are "XHTML" (treats the
>> input as HTML source, and may run a tidy algorithm in case the HTML is not
>> well-formed XHTML).  "
>>
>> I experimented with a few ways of processing the HTML to XML, such as
>> ImportTextFile and ImportXMLFile, but I assume that because the HTML is
>> not valid XML this doesn't work.
>>
>> e.g.
>>
>> warnings:ImportTextFile_2
>>   a sml:ImportTextFile ;
>>   sm:next warnings:Convert_html_XMLToRDF_2 ;
>>   sm:nodeX 617 ;
>>   sm:nodeY 39 ;
>>   sm:outputVariable "textOut" ;
>>   sml:sourceFilePath "mfu@id=4851.txt" ;
>>   rdfs:label "Import text file xml test" ;
>> .
>>
>> # The ConvertXMLToRDF module below fails, by the looks of it with a
>> # character encoding issue. Exception message:
>> # Caused by: org.topbraid.spin.sparqlmotion.modules.SMException:
>> # org.xml.sax.SAXParseException; lineNumber: 9; columnNumber: 43; The
>> # reference to entity "l" must end with the ';' delimiter.
>>
>> warnings:Convert_html_XMLToRDF_2
>>   a sml:ConvertXMLToRDF ;
>>   sm:nodeX 601 ;
>>   sm:nodeY 272 ;
>>   sml:baseURI "www.example2.com" ;
>>   sml:xml [
>>       sp:varName "textOut" ;
>>     ] ;
>>   sml:xmlType "xhtml" ;
>>   rdfs:label "Convert html XMLTo RDF 2" ;
>> .
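
The SAXParseException quoted above is the message an XML parser typically gives when it meets an `&` that does not start a complete reference, for example a raw `&l=...` inside a link's query string, which HTML tolerates but XML does not. The following is a minimal sketch for locating such ampersands in a text file before handing it to an XML parser. The name `find_bare_ampersands` and the sample string are made up for illustration, and the check is purely syntactic: it does not know which named entities the parser actually defines, and the parser's reported column may point a few characters past the `&`.

```python
import re

# An '&' that is NOT followed by a complete reference such as &amp; &#169; or &#xA9;
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def find_bare_ampersands(text):
    """Return (line_number, column_number, snippet) per unescaped '&', 1-based."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in BARE_AMPERSAND.finditer(line):
            start = match.start()
            hits.append((line_no, start + 1, line[start:start + 20]))
    return hits


if __name__ == "__main__":
    # Hypothetical sample; replace with the contents of the real file.
    sample = '<a href="page?id=4851&l=en">link</a>'
    for hit in find_bare_ampersands(sample):
        print(hit)
```

Escaping each reported `&` as `&amp;` would remove this particular parse error, though other non-XML constructs in the HTML could still fail.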
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "TopBraid Suite Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to topbraid-users+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/topbraid-users/17f9a123-98b7-49e2-bc67-11524e0e1911%40googlegroups.com
>> <https://groups.google.com/d/msgid/topbraid-users/17f9a123-98b7-49e2-bc67-11524e0e1911%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>>
>
>


-- 
____________________________________________________________
__________________________

Many Thanks
*Simon Opper*
Chief Data Scientist at SURROUND Australia Pty Ltd
Address  P.O. Box 86, Mawson, Canberra ACT 2607
Phone     +61 477 641 837
Email       simon.op...@surroundaustralia.com
Website     https://www.surroundaustralia.com

*Enhancing Intelligence Within Organisations*
*delivering evidence that connects decisions to outcomes*
