Re: [julia-users] How to read and change the content of web pages to the vector ?

Avik Sengupta Thu, 05 Jun 2014 03:37:25 -0700

I second the suggestion of JSoup. Alternatively, you could use BeautifulSoup. 
If you want the data in-process in Julia, you can call either package from 
Julia, using JavaCall for JSoup or PyCall for BeautifulSoup.
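
As a rough illustration of what such a parser gives you, here is a minimal 
sketch in Python. It uses the standard-library html.parser rather than 
BeautifulSoup, purely so that it runs without installing anything. The names 
TextCollector and page_to_vector and the sample page are made up for this 
example. Calling it from Julia through PyCall is not shown.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the visible text fragments of an HTML page into a list."""

    # Text inside these tags is not visible page content, so it is skipped.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.fragments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.fragments.append(text)


def page_to_vector(html):
    """Return the non-empty text fragments of `html`, in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.fragments


# Hypothetical sample page, standing in for a downloaded web page.
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><p>First <b>bold</b> line</p></body></html>"
)
words = page_to_vector(sample)
```

A dedicated library such as BeautifulSoup or JSoup copes with broken markup 
far better than this, which is why they are the ones suggested above.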


Regards
-
Avik

On Thursday, 5 June 2014 11:18:48 UTC+1, Yuuki Soho wrote:
>
> So, you want to parse a web page and get the content out of it.
>
> I'm not sure there's a very good way of doing it currently, because HTML 
> pages are often messy (
> http://programmers.stackexchange.com/questions/151739/getting-data-from-a-webpage-in-a-stable-and-efficient-way
> ). What you want is some kind of HTML parser like jsoup. I don't think any 
> are available in Julia right now. You could try an XML parser (
> https://github.com/lindahua/LightXML.jl), but I'm not sure that will 
> work very well.
>
> Otherwise you can take the dirty road and use regular expressions to 
> extract what you want.
>
