I agree with Igor - I would either make sure the session is ThreadLocal or, more simply, why not create the session at the start of the saveInBatch method and close it at the end? Creating a SessionFactory is an expensive operation, but creating a Session is a relatively cheap one.

On 6 Sep 2015 07:27, "Igor Berman" <igor.ber...@gmail.com> wrote:
> How do you create your session? Do you reuse it across threads? How do you
> create/close the session manager?
> Look for the problem in session creation - probably something deadlocked.
> As far as I remember, a Hibernate session should be created per thread.
>
> On 6 September 2015 at 07:11, Zoran Jeremic <zoran.jere...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm developing a long-running process that should find the RSS feeds that
>> all users in the system have registered to follow, parse these feeds,
>> extract new entries, and store them back to the database as Hibernate
>> entities so users can retrieve them. I want to use Apache Spark to enable
>> parallel processing, since this process might take several hours
>> depending on the number of users.
>>
>> The approach I thought should work was to use
>> *useridsRDD.foreachPartition*, so I can have a separate Hibernate session
>> for each partition. I created a database session manager that is
>> initialized for each partition and keeps the Hibernate session alive
>> until the process is over.
>>
>> Once all RSS feeds from one source are parsed and Feed entities are
>> created, I send the whole list to a database manager method that saves
>> the whole list in batch:
>>
>>> public <T extends BaseEntity> void saveInBatch(List<T> entities) {
>>>     try {
>>>         boolean isActive = session.getTransaction().isActive();
>>>         if (!isActive) {
>>>             session.beginTransaction();
>>>         }
>>>         for (Object entity : entities) {
>>>             session.save(entity);
>>>         }
>>>         session.getTransaction().commit();
>>>     } catch (Exception ex) {
>>>         if (session.getTransaction() != null) {
>>>             session.getTransaction().rollback();
>>>         }
>>>         ex.printStackTrace();
>>>     }
>>> }
>>
>> However, this works only if I have one Spark partition. If there are two
>> or more partitions, the whole process blocks as soon as I try to save the
>> first entity. To make things simpler, I tried to simplify the Feed
>> entity so that it doesn't reference, and is not referenced by, any other
>> entity. It also doesn't have any collections.
>>
>> I hope that some of you have already tried something similar and could
>> give me an idea how to solve this problem.
>>
>> Thanks,
>> Zoran
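For what it's worth, the session-per-call pattern suggested above can be sketched roughly as below. The `Session`, `Transaction`, and `SessionFactory` interfaces here are simplified stand-ins for the real Hibernate types (declared locally so the sketch is self-contained), and `BatchSaver` is a hypothetical class name - this is a shape, not a drop-in implementation:

```java
import java.util.List;

// Simplified stand-ins for org.hibernate.Transaction / Session / SessionFactory.
interface Transaction {
    void begin();
    void commit();
    void rollback();
}

interface Session extends AutoCloseable {
    Transaction getTransaction();
    void save(Object entity);
    void close(); // narrowed to not throw, so try-with-resources stays clean
}

interface SessionFactory {
    Session openSession(); // cheap, unlike building the factory itself
}

// The pattern from the reply: never share a Session across threads or
// partitions. Share only the thread-safe SessionFactory, open a fresh
// Session at the start of saveInBatch, and close it at the end.
class BatchSaver {
    private final SessionFactory sessionFactory;

    BatchSaver(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public <T> void saveInBatch(List<T> entities) {
        try (Session session = sessionFactory.openSession()) {
            Transaction tx = session.getTransaction();
            try {
                tx.begin();
                for (T entity : entities) {
                    session.save(entity);
                }
                tx.commit();
            } catch (RuntimeException ex) {
                tx.rollback();
                throw ex; // let the Spark task fail visibly, don't swallow it
            }
        }
    }
}
```

On the Spark side, the `SessionFactory` would be built (or looked up from a lazily initialized singleton) inside `foreachPartition`, since it isn't serializable, and each batch save then opens its own short-lived session as above.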