Re: Java out of memory problem

2010-12-20 Thread clj123
Thanks Ken, your suggestion solved my problem with the OOM exception.

I tried your suggestion to run it in parallel, but I didn't see much
difference. Instead I called future on the let, and that helped
performance.
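
A minimal sketch of that change, assuming the future wraps the body of
persist-rows from the code quoted below (the exact placement is an
assumption, and the futures should be deref'd or otherwise awaited
before the process exits, or inserts may be lost):

(defn persist-rows
  [headers rows id]
  ;; assumption: run the transform and the insert on a pooled thread
  (future
    (let [mrows (transform-rows rows id)]
      (with-db *db*
        (apply insert-into-table
               :my-table
               [:col1 :col2 :col3]
               mrows))))
  nil)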

On Dec 17, 2:55 pm, Ken Wesson  wrote:
> On Fri, Dec 17, 2010 at 5:39 PM, clj123  wrote:
> > (defn persist-rows
> >  [headers rows id]
> >  (let [mrows (transform-rows rows id)]
> >    (with-db *db* (try
> >         (apply insert-into-table
> >                :my-table
> >                [:col1 :col2 :col3]
> >                mrows)))
> >     nil ))
>
> > (defn filter-data
> >  [rows item-id header id]
> >   (persist-rows
> >      header
> >      (filter #(= (:item_id %) item-id) rows)
> >      id))
>
> > (dorun (pmap #(filter-data rows %1 header %2)
> >             items id ))
>
> Rows gets traversed repeatedly, for each item/id pair. So the head
> gets held onto.
>
> To make this scale you're going to need to regenerate the rows seq for
> each traversal; you need something like
>
> (doseq [[item-id id] (map vector items ids)]
>   (let [rows (generate-rows-however)]
>     (filter-data rows item-id header id)))
>
> That's not parallel, but I don't know that parallel buys you much when
> the task is I/O bound and takes however long it takes the database
> server to process that number of requests.
>
> If parallelism really does buy you something here you could replace
> (doseq ...) with (dorun (pmap identity (for ...))) and that might
> parallelize the realization of the items, and thus all of the nested
> operations.

Re: Java out of memory problem

2010-12-17 Thread Ken Wesson
On Fri, Dec 17, 2010 at 5:39 PM, clj123  wrote:
> (defn persist-rows
>  [headers rows id]
>  (let [mrows (transform-rows rows id)]
>    (with-db *db* (try
>         (apply insert-into-table
>                :my-table
>                [:col1 :col2 :col3]
>                mrows)))
>     nil ))
>
> (defn filter-data
>  [rows item-id header id]
>   (persist-rows
>      header
>      (filter #(= (:item_id %) item-id) rows)
>      id))
>
> (dorun (pmap #(filter-data rows %1 header %2)
>             items id ))

Rows gets traversed repeatedly, for each item/id pair. So the head
gets held onto.
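
This effect is easy to reproduce at the REPL; a sketch, not the code
above (sized to exhaust a modest heap; scale up if yours is larger):

(last (map inc (range 100000000)))
;; fine: nothing else references the head, so realized elements are
;; collected as the traversal moves along

(let [xs (map inc (range 100000000))]
  [(last xs) (first xs)])
;; OOM-prone: xs is still needed after (last xs), so everything
;; realized during the traversal stays reachable from the head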

To make this scale you're going to need to regenerate the rows seq for
each traversal; you need something like

(doseq [[item-id id] (map vector items ids)]
  (let [rows (generate-rows-however)]
    (filter-data rows item-id header id)))

That's not parallel, but I don't know that parallel buys you much when
the task is I/O bound and takes however long it takes the database
server to process that number of requests.

If parallelism really does buy you something here you could replace
(doseq ...) with (dorun (pmap identity (for ...))) and that might
parallelize the realization of the items, and thus all of the nested
operations.
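
Spelled out, that replacement would look something like this (a sketch;
generate-rows-however is the same placeholder as above):

(dorun
 (pmap identity
       (for [[item-id id] (map vector items ids)]
         (let [rows (generate-rows-however)]
           (filter-data rows item-id header id)))))

One caveat: pmap applies its function on worker threads but realizes
its input seq on the calling thread, so with identity the for bodies
may still run sequentially. Handing pmap the work itself avoids that
uncertainty:

(dorun
 (pmap (fn [[item-id id]]
         ;; rows is regenerated per pair, so no head is retained
         (let [rows (generate-rows-however)]
           (filter-data rows item-id header id)))
       (map vector items ids)))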

Re: Java out of memory problem

2010-12-17 Thread clj123
(defn persist-rows
  [headers rows id]
  (let [mrows (transform-rows rows id)]
    (with-db *db* (try
         (apply insert-into-table
                :my-table
                [:col1 :col2 :col3]
                mrows)))
     nil ))

(defn filter-data
  [rows item-id header id]
   (persist-rows
      header
      (filter #(= (:item_id %) item-id) rows)
      id))

(dorun (pmap #(filter-data rows %1 header %2)
             items id ))

On Dec 16, 4:45 pm, Michael Ossareh  wrote:
> On Thu, Dec 16, 2010 at 09:19, clj123  wrote:
> > Hello,
>
> > I'm trying to insert a large number of records into a database, but
> > it's not scaling correctly. For 100,000 records it takes 10 seconds,
> > and for 1,000,000 records it takes 2 minutes to save. But for
> > 2,500,000 records it throws a Java heap out-of-memory exception.
>
> > I've tried separating the record processing from the actual batch
> > save. Just processing the 2,500,000 records in memory takes 30
> > seconds; with the batch insert it throws the above exception. I don't
> > understand why saving to a database creates more Java heap usage.
>
> > Any ideas would be appreciated.
>
> What indexes are on the table that you're inserting into? To me the increase
> in time suggests your index is being rebuilt after each insert.
>
> As for the memory, I concur with zeph: you're either holding onto the
> head of a seq, or you're keeping a reference to a portion of a string
> that holds the larger data structures alive, and you're OOMing as a
> result.
>
> Code please :)

Re: Java out of memory problem

2010-12-16 Thread Michael Ossareh
On Thu, Dec 16, 2010 at 09:19, clj123  wrote:

> Hello,
>
> I'm trying to insert a large number of records into a database, but
> it's not scaling correctly. For 100,000 records it takes 10 seconds,
> and for 1,000,000 records it takes 2 minutes to save. But for
> 2,500,000 records it throws a Java heap out-of-memory exception.
>
> I've tried separating the record processing from the actual batch
> save. Just processing the 2,500,000 records in memory takes 30
> seconds; with the batch insert it throws the above exception. I don't
> understand why saving to a database creates more Java heap usage.
>
> Any ideas would be appreciated.


What indexes are on the table that you're inserting into? To me the increase
in time suggests your index is being rebuilt after each insert.

As for the memory, I concur with zeph: you're either holding onto the
head of a seq, or you're keeping a reference to a portion of a string
that holds the larger data structures alive, and you're OOMing as a
result.
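
A sketch of the string case, which is the JDK substring behavior of the
era: substrings shared the parent string's char array (big-str is a
hypothetical name, and newer JDKs copy on substring):

(def field (subs big-str 0 16))           ; pins all of big-str's chars
(def field (String. (subs big-str 0 16))) ; copies just the 16 chars,
                                          ; so the parent's char array
                                          ; can be collected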

Code please :)

Re: Java out of memory problem

2010-12-16 Thread zeph
You might be getting close to OOM with the in-memory processing without
knowing it, and the batched (lazy) version is probably holding onto
data, creating the memory leak. Would you be able to post the relevant
source?

Java out of memory problem

2010-12-16 Thread clj123
Hello,

I'm trying to insert a large number of records into a database, but
it's not scaling correctly. For 100,000 records it takes 10 seconds,
and for 1,000,000 records it takes 2 minutes to save. But for
2,500,000 records it throws a Java heap out-of-memory exception.

I've tried separating the record processing from the actual batch
save. Just processing the 2,500,000 records in memory takes 30
seconds; with the batch insert it throws the above exception. I don't
understand why saving to a database creates more Java heap usage.

Any ideas would be appreciated.
