combiner function not called for r/fold with single partition?

2015-08-16 Thread Ron Toland
At my local Clojure users group last week, we encountered some odd behavior 
with r/fold.

Specifically, it seems like the three-arity version (r/fold combiner-fn 
reducer-fn coll) never calls the combiner-fn to merge results if the coll 
has no more elements than the default partition size (512); only its 
zero-arity form is called, to supply the initial value.

This leads to some surprising behavior. For instance, Example 8 in this set 
of reducer examples (written by someone else, and otherwise very nice) will 
not calculate the average-age as desired, since the combiner-fn will not be 
called to merge anything for a collection of just 100 elements: 
https://gist.github.com/ianrumford/658

I realize that for a combiner-fn with just zero-arity or two-arity versions 
defined, calling it on a single partition would not make sense. Perhaps 
r/fold should be revised so that if a combiner-fn is provided that has a 
single-arity version defined, it still gets called even if there is just 
one partition? 
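
The behavior described above can be checked with a small sketch. It counts how often the two-arity (merge) form of the combiner runs for a vector that fits in one partition and for one that does not. The names counting-combiner, combine-calls, and fold-and-count are made up for this illustration; only r/fold and the default partition size of 512 come from clojure.core.reducers.

```clojure
(require '[clojure.core.reducers :as r])

;; Counts invocations of the two-arity (merge) form of the combiner.
(def combine-calls (atom 0))

(defn counting-combiner
  "Combiner for summing: the zero-arity form supplies the seed,
  the two-arity form merges two partial sums and records the call."
  ([] 0)
  ([a b]
   (swap! combine-calls inc)
   (+ a b)))

(defn fold-and-count
  "Folds a vector of the integers 0..n-1 with the default partition size
  and reports the sum plus how often the merge step ran."
  [n]
  (reset! combine-calls 0)
  (let [result (r/fold counting-combiner + (vec (range n)))]
    {:result result :combine-calls @combine-calls}))

(fold-and-count 100)   ; single partition: the merge step never runs
(fold-and-count 2000)  ; several partitions: the merge step runs
```

With 100 elements the sum is still correct because the reducer-fn does all the work; any logic that lives only in the combiner's merge step is silently skipped.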

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Read Microsoft Word .doc files in Clojure

2014-12-05 Thread Ron Toland
Divya,

Here's a simple example for extracting text from an input stream (any file 
can be opened as one):

(ns sample.tika
  (:require [clj-tika.core :as tika]))

(defn extract-text
  "Extracts the text from the input stream."
  [input-stream]
  (tika/parse input-stream))
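
To get from a file on disk to an input stream, something like the following sketch should work. It uses only clojure.java.io from the standard library; with-file-stream is a hypothetical helper name, not part of clj-tika, and it takes the extraction function as an argument so it does not depend on any particular library.

```clojure
(require '[clojure.java.io :as io])

(defn with-file-stream
  "Opens the file at path as an input stream, applies f to the stream,
  and closes the stream afterwards. f must finish reading before it
  returns, because the stream is closed as soon as f is done."
  [path f]
  (with-open [in (io/input-stream path)]
    (f in)))
```

Assuming extract-text from the example above is loaded, (with-file-stream "report.doc" extract-text) would hand it an open stream for that file; "report.doc" is just a placeholder path.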


Ron 



On Friday, December 5, 2014 at 2:32 AM, Divya Shravanthi wrote:

 Hi Ron,
 
 Could you please share an example of how to pull simple text from pdf/doc 
 files. I couldn't find a proper tutorial for clj-tika. 
 
 Thanks
 
 On Friday, 3 January 2014 05:03:11 UTC+5:30, Ron Toland wrote:
  If all you need is the text, you could use Apache Tika to extract it: 
  http://tika.apache.org/
  
  There's a simple clojure lib to get you started: 
  https://github.com/alexott/clj-tika
  
  I've used it to pull text out of .doc, .pdf, and .odt files.
  
  Ron
  
  On Wednesday, January 1, 2014 11:49:30 PM UTC-8, Joshua Mendoza wrote:
   Hi!,
   
   I've been looking for libraries or resources to read MS .doc files in 
   Clojure, but found none. Has anyone tried, used, encountered or 
   witnessed such a thing to read them?
   
   I found a lot of info made publicly available by the government in .doc 
   files, and I want to process them automatically with Clojure.
   
   The closest thing I know of is Incanter, but that reads XLS files, which 
   is not useful at all for this...
   
   Well, any help would be great.
   
   Thank you! 
 



Re: Read Microsoft Word .doc files in Clojure

2014-01-02 Thread Ron Toland
If all you need is the text, you could use Apache Tika to extract 
it: http://tika.apache.org/

There's a simple Clojure lib to get you started: 
https://github.com/alexott/clj-tika

I've used it to pull text out of .doc, .pdf, and .odt files.

Ron

On Wednesday, January 1, 2014 11:49:30 PM UTC-8, Joshua Mendoza wrote:

 Hi!,

 I've been looking for libraries or resources to read MS .doc files in 
 Clojure, but found none. Has anyone tried, used, encountered or 
 witnessed such a thing to read them?

 I found a lot of info made publicly available by the government in .doc 
 files, and I want to process them automatically with Clojure.

 The closest thing I know of is Incanter, but that reads XLS files, which is 
 not useful at all for this...

 Well, any help would be great.

 Thank you!




[ANN] Recursd - Functional Programming Conference in San Diego

2013-12-09 Thread Ron Toland


Recursd (http://recursd.com) is a one-day technical conference on 
functional programming.

Join other programmers and enthusiasts on Saturday, January 18th, 2014, to 
take part in presentations and workshops about functional programming 
languages and applications.

Recursd will be held in Central San Diego at the Ansir Innovation Center.

Registration 
(https://www.eventbrite.com/e/recursd-a-functional-programming-conference-tickets-9601475271) 
is $10 before Jan 4, 2014, and $20 after.

Presenters receive free admission. To submit a proposal, contact us at: 
recu...@gmail.com

Hope to see you there!

Ron



Re: Clojure Jruby (Ruby on Rails) Interop

2013-09-10 Thread Ron Toland
Rodrigo,

We went with RabbitMQ over ZeroMQ mostly because we were more familiar with 
it. As I understand it, ZeroMQ is less of a message-queueing system and 
more "sockets on steroids". Which one is best will depend pretty strongly 
on your particular use case.

In our case, our messages are pretty simple: they're just MongoDB docIds 
plus some extra data. We use the queue the message goes into as the 
metadata for the message. If a docId appears in the word-count queue, 
for example, the Clojure workers know to calculate the average word count 
stats for the given document.
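
As a rough illustration of the queue-as-metadata idea, here is a dependency-free sketch in which the queue name alone selects the work to do. Everything in it is hypothetical: the in-memory documents map stands in for MongoDB, handlers and handle-message are made-up names, and "average words per sentence" is only a guess at what a word count stat might look like. A real worker would receive the message from a RabbitMQ client library instead of a direct function call.

```clojure
(require '[clojure.string :as str])

;; Hypothetical in-memory stand-in for the document store (docId -> text).
(def documents
  {"doc-1" "one two three. four five."})

(defn avg-word-count
  "Average number of words per sentence (an assumed definition).
  Expects text with at least one non-blank sentence."
  [text]
  (let [sentences (remove str/blank? (str/split text #"[.!?]"))
        counts    (map #(count (re-seq #"\S+" %)) sentences)]
    (/ (reduce + counts) (count sentences))))

;; The queue name is the only metadata: it decides which handler runs.
(def handlers
  {"word-count" (fn [{:keys [doc-id]}]
                  (avg-word-count (get documents doc-id)))})

(defn handle-message
  "Dispatches a message to the handler registered for its queue."
  [queue-name message]
  (if-let [handler (get handlers queue-name)]
    (handler message)
    (throw (ex-info "No handler for queue" {:queue queue-name}))))

(handle-message "word-count" {:doc-id "doc-1"})
```

Adding a new kind of job then means adding a queue and a handler entry, while the message format stays the same.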

For us, security comes in the form of username/password authentication for 
RabbitMQ, plus blocking the ports to everyone save the Ruby and Clojure 
servers that need to communicate with it. If we were firing off messages 
with more sensitive data (e.g., passwords), we'd look at encryption then.

The performance of the pipeline so far has been great. Ruby can offload the 
heavy processing to Clojure, so the user can continue working on the site 
without interruption. Clojure gets to focus on just the processing side, so 
we can optimize and scale it separately.

As for portability, we started with everything on Heroku, but we've been 
slowly migrating to AWS as our needs scale up. Our configuration makes this 
really easy: we can move the RabbitMQ piece over one week, then the Clojure 
piece the next, without changing anything but a few lines in a config file.

Ron

On Monday, September 9, 2013 7:47:54 AM UTC-7, rdelcueto wrote:

 Hey Ron,
 Thanks for your response. Digging deeper into my question...

 When I read about the Torquebox Immutant duet, I thought it was a 
 particularly interesting solution, because it was fairly easy to deploy and 
 both processes would live inside a JVM environment. I was impressed by how 
 Clojure data structures mapped to Ruby structures and vice versa; it seemed 
 to provide a very clean and idiomatic messaging platform. Plus it would 
 provide tools for caching, clustering, and whatnot. Still, I wasn't very 
 keen on the JRuby side, since it's known to have compatibility issues 
 with certain gems.

 Yesterday, while researching the subject, I found out about ZeroMQ. Do you 
 have any particular reason to use RabbitMQ over other messaging libraries? 
 Are there any caveats to your interop model?
 How portable is deploying a site using a messaging solution such as 
 RabbitMQ?

 I also found out about Google's Protocol Buffers; they seemed like a 
 lightweight solution to pass language-agnostic data structures through the 
 messaging infrastructure.
 Do messages need to be encapsulated somehow, or is this actually 
 unnecessary? How is it done in your case?

 Regarding security and sensitive information interop: should messages be 
 encrypted? Should they be encrypted as a whole message or partially (only 
 sensitive data)?
 What are the performance implications of this pipeline? Is the overhead 
 and footprint of such a setup (Ruby + Messaging Broker + ClojureJVM) big 
 enough for it to be worth thinking about writing everything in Clojure 
 (using the Luminus framework)?

 On Monday, September 9, 2013 8:10:41 AM UTC-5, Ron Toland wrote:

 At Rewryte, we use Rails for the web frontend and Clojure for the data 
 processing backend for exactly the reasons you described.

 We use RabbitMQ to communicate between the two. This maintains separation 
 between the two apps (no JRuby required), and lets us scale them both 
 independently, while taking advantage of each language/framework's 
 strengths.





Clojure Jruby (Ruby on Rails) Interop

2013-09-09 Thread Ron Toland
At Rewryte, we use Rails for the web frontend and Clojure for the data 
processing backend for exactly the reasons you described.

We use RabbitMQ to communicate between the two. This maintains separation 
between the two apps (no JRuby required), and lets us scale them both 
independently, while taking advantage of each language/framework's strengths.



Re: End user applications

2013-06-13 Thread Ron Toland
At rewryte.com, we use Clojure for all our back-end data processing.





Re: ANN: rewryte.com uses clojure for automated writing feedback

2013-01-27 Thread Ron Toland
Zack: It only accepts plain txt files at the moment. We're working on 
supporting other formats. :)
