[julia-users] Julia's text mining capabilities

2015-09-01 Thread Venkat Ramakrishnan
Folks, How fast is Julia compared to other languages like R and Python in text processing? Any benchmarks? Any parallel processing facility specifically available for text processing in Julia? Thanks, Venkat.

Re: [julia-users] Julia's text mining capabilities

2015-09-01 Thread Mauro
This may have answers: https://youtu.be/dgfIIZ5yA4E (though I haven't watched it yet) On Tue, 2015-09-01 at 14:42, Venkat Ramakrishnan wrote: > Folks, > > How fast is Julia compared to other languages like R and Python in text > processing? > Any benchmarks? > > Any parallel processing facility

Re: [julia-users] Julia's text mining capabilities

2015-09-02 Thread Venkat Ramakrishnan
Thanks, Mauro. Unfortunately, the video doesn't mention any performance benchmarks against other languages that they considered before selecting Julia, and neither does the website mentioned in the video. Does anyone else have clues or pointers to information? Regards, Venkat. On Wednesday, 2 September

Re: [julia-users] Julia's text mining capabilities

2015-09-02 Thread Jonathan Malmaud
It really depends on what you mean by 'text processing' - if the critical section of your code is essentially a loop that iterates over characters/words of large strings, then Julia could be between 1 and 2 orders of magnitude faster. If you mean the performance of core string-processing libra
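The kind of workload described above, a hot loop that walks a large string character by character, can be sketched on the Python side as follows. This is only an illustrative micro-benchmark, not something posted in the thread: the function names `count_words` and `time_count_words` and the sample sentence are hypothetical, and the timing you get depends entirely on your machine and interpreter version. An equivalent loop written in Julia would be timed the same way to compare.

```python
import time
from collections import Counter


def count_words(text):
    """Count words by scanning the string one character at a time."""
    counts = Counter()
    current = []
    for ch in text:
        if ch.isalnum():
            current.append(ch.lower())
        elif current:
            # A non-alphanumeric character ends the current word.
            counts["".join(current)] += 1
            current = []
    if current:  # flush the final word if the text does not end in a separator
        counts["".join(current)] += 1
    return counts


def time_count_words(text, repeats=3):
    """Return the best wall-clock time in seconds over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        count_words(text)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Hypothetical input: a short sentence repeated to make a large string.
    sample = "the quick brown fox jumps over the lazy dog " * 20000
    print(f"best of 3: {time_count_words(sample):.3f} s")
```

Taking the best of several runs reduces noise from other processes; it does not make the number comparable across machines.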

Re: [julia-users] Julia's text mining capabilities

2015-09-02 Thread Venkat Ramakrishnan
Dear Jonathan, That's great to hear that you are actively contributing to Julia's text processing capabilities. My use case has all of the text functions that you mentioned, so, thanks for the info. It looks like the nearest competitor is Python, and Julia's performance is see-sawing compared to

Re: [julia-users] Julia's text mining capabilities

2015-09-02 Thread Steven G. Johnson
On Wednesday, September 2, 2015 at 10:27:54 AM UTC-4, Venkat Ramakrishnan wrote: > > My use case has all of the text functions that you mentioned, so, thanks > for the info. It looks like the nearest competitor is Python, and Julia's > performance is see-sawing compared to Python in the regexp

Re: [julia-users] Julia's text mining capabilities

2015-09-02 Thread Steven G. Johnson
On Wednesday, September 2, 2015 at 12:19:33 PM UTC-4, Steven G. Johnson wrote: > > Note that regexps in Julia are implemented with the PCRE library. If you > google "PCRE vs Python" you'll find several comparisons. The upshot seems > to be that PCRE is about 2x slower than Python's regex imp
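For anyone who wants current numbers rather than old web comparisons, one rough way to get the Python half of a regex comparison is sketched below, using the standard `re` and `timeit` modules. The pattern, the sample text, and the helper names `find_matches` and `best_time` are assumptions made up for illustration; the same pattern and input would need to be run through Julia (which uses PCRE, as noted above) to make a fair comparison.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
WORD_WITH_DIGITS = re.compile(r"\b[a-z]+\d+\b")
SAMPLE = "alpha1 beta gamma22 delta " * 10000


def find_matches(text):
    """Return every lowercase word that ends in one or more digits."""
    return WORD_WITH_DIGITS.findall(text)


def best_time(text, repeats=3):
    """Return the fastest of a few single-shot timings, in seconds."""
    return min(timeit.repeat(lambda: find_matches(text), number=1, repeat=repeats))


if __name__ == "__main__":
    print(f"matches: {len(find_matches(SAMPLE))}")
    print(f"best of 3: {best_time(SAMPLE):.4f} s")
```

Results vary a great deal with the pattern, so a single pattern like this one says little about regex performance in general.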

Re: [julia-users] Julia's text mining capabilities

2015-09-03 Thread Venkat Ramakrishnan
Thanks, Steven. Since Julia presents itself as a higher-performance language, it would be worthwhile to post the various performance benchmarks (text, math ops, I/O, etc.) on its web page with every major release, compared against the current versions of its peers, if not done already which I am not aw

Re: [julia-users] Julia's text mining capabilities

2015-09-03 Thread Tim Holy
Most stuff gets done by people who need it. If this is important to you but doesn't exist, your best bet is to do it yourself and contribute it to the project. I bet there would be interest in adding your benchmarks somewhere accessible from the web page. Best, --Tim On Thursday, September 03,

Re: [julia-users] Julia's text mining capabilities

2015-09-03 Thread Venkat Ramakrishnan
No offense, but the informal model of 'most stuff gets done by people who need it' leads to outdated information on external websites, as Steven pointed out. I don't see why it would be any different if I or someone else does it on an external site and puts the link in the julia webpag

Re: [julia-users] Julia's text mining capabilities

2015-09-03 Thread Steven G. Johnson
There is a plan going forward to have more automated performance benchmarking linked to git, especially so that performance improvements (or regressions) can be quantified as they happen. The hardware has been ordered, but it will take a while to get this kind of automation up and running. Ho

Re: [julia-users] Julia's text mining capabilities

2015-09-03 Thread Venkat Ramakrishnan
Thanks much for the update Steven. I'll see if I can get recent data from PCRE. Thx, Venkat. On Fri, Sep 4, 2015 at 8:59 AM, Steven G. Johnson wrote: > There is a plan going forward to have more automated performance > benchmarking linked to git, especially so that performance improvements (or