I agree with Igor - I would either make sure the session is ThreadLocal or, more simply, why not create the session at the start of the saveInBatch method and close it at the end? Creating a SessionFactory is an expensive operation, but creating a Session is a relatively cheap one.

On 6 Sep 2015 07:27, "Igor Berman" <igor.ber...@gmail.com> wrote:
> How do you create your session? Do you reuse it across threads? How do you
> create/close the session manager?
> Look for the problem in session creation - probably something deadlocked.
> As far as I remember, a Hibernate session should be created per thread.
>
> On 6 September 2015 at 07:11, Zoran Jeremic <zoran.jere...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm developing a long-running process that should find the RSS feeds that
>> all users in the system have registered to follow, parse these feeds,
>> extract new entries, and store them back to the database as Hibernate
>> entities so users can retrieve them. I want to use Apache Spark to enable
>> parallel processing, since this process might take several hours
>> depending on the number of users.
>>
>> The approach I thought should work was to use
>> *useridsRDD.foreachPartition*, so I can have a separate Hibernate session
>> for each partition. I created a database session manager that is
>> initialized for each partition and keeps the Hibernate session alive
>> until the process is over.
>>
>> Once all RSS feeds from one source are parsed and Feed entities are
>> created, I send the whole list to a database manager method that saves
>> the whole list in batch:
>>
>>> public <T extends BaseEntity> void saveInBatch(List<T> entities) {
>>>     try {
>>>         boolean isActive = session.getTransaction().isActive();
>>>         if (!isActive) {
>>>             session.beginTransaction();
>>>         }
>>>         for (Object entity : entities) {
>>>             session.save(entity);
>>>         }
>>>         session.getTransaction().commit();
>>>     } catch (Exception ex) {
>>>         if (session.getTransaction() != null) {
>>>             session.getTransaction().rollback();
>>>         }
>>>         ex.printStackTrace();
>>>     }
>>> }
>>
>> However, this works only if I have one Spark partition. If there are two
>> or more partitions, the whole process blocks as soon as I try to save the
>> first entity. To make things simpler, I tried to simplify the Feed
>> entity so that it doesn't reference, and is not referenced by, any other
>> entity. It also doesn't have any collections.
>>
>> I hope that some of you have already tried something similar and could
>> give me an idea how to solve this problem.
>>
>> Thanks,
>> Zoran
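For what it's worth, the session-per-call pattern suggested above can be sketched roughly as below. The `Session`, `Transaction`, and `SessionFactory` interfaces here are simplified stand-ins for the real Hibernate types (declared locally so the sketch is self-contained), and `BatchSaver` is a hypothetical class name - this is a shape, not a drop-in implementation:

```java
import java.util.List;

// Simplified stand-ins for org.hibernate.Transaction / Session / SessionFactory.
interface Transaction {
    void begin();
    void commit();
    void rollback();
}

interface Session extends AutoCloseable {
    Transaction getTransaction();
    void save(Object entity);
    void close(); // narrowed to not throw, so try-with-resources stays clean
}

interface SessionFactory {
    Session openSession(); // cheap, unlike building the factory itself
}

// The pattern from the reply: never share a Session across threads or
// partitions. Share only the thread-safe SessionFactory, open a fresh
// Session at the start of saveInBatch, and close it at the end.
class BatchSaver {
    private final SessionFactory sessionFactory;

    BatchSaver(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public <T> void saveInBatch(List<T> entities) {
        try (Session session = sessionFactory.openSession()) {
            Transaction tx = session.getTransaction();
            try {
                tx.begin();
                for (T entity : entities) {
                    session.save(entity);
                }
                tx.commit();
            } catch (RuntimeException ex) {
                tx.rollback();
                throw ex; // let the Spark task fail visibly, don't swallow it
            }
        }
    }
}
```

On the Spark side, the `SessionFactory` would be built (or looked up from a lazily initialized singleton) inside `foreachPartition`, since it isn't serializable, and each batch save then opens its own short-lived session as above.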