Hey Ron,

Thanks for volunteering to help troubleshoot this. I actually lost track of this, but could you file a bug at issues.apache.org/jira under the Sqoop project describing what you're seeing? I think you may have found a legitimate load bug with the new Parquet feature.
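In the meantime, here's roughly the invocation I'm picturing from your description below. The JDBC URL, username, table, and target directory are placeholders I made up, and the -D properties are just the stock MapReduce settings for map container size, map heap, and the local job runner, so correct me if your actual command differs:

    # Placeholders: the connect string, user, table, and target dir are invented.
    # mapreduce.map.memory.mb / mapreduce.map.java.opts match the 2 GB container
    # and -Xmx2048m heap you mention; mapreduce.framework.name=local runs the
    # job in-process through LocalJobRunner instead of on YARN.
    sqoop import \
      -D mapreduce.framework.name=local \
      -D mapreduce.map.memory.mb=2048 \
      -D mapreduce.map.java.opts=-Xmx2048m \
      --connect jdbc:mysql://dbhost/testdb \
      --username sqoopuser \
      --table million_row_table \
      --as-parquetfile \
      --target-dir /tmp/million_row_table_parquet

One thing to keep in mind: with the local framework the map task runs inside the client JVM, so the container size and map opts don't really apply there; the heap that matters is whatever the sqoop/hadoop launcher itself gets (raising HADOOP_CLIENT_OPTS is the usual trick, though I haven't double-checked that path for sqoop). And if your buffering theory holds, the Parquet row group size (parquet.block.size in parquet-mr) would be the knob for trading writer memory against file count, but I'm not sure offhand whether Sqoop's Parquet writer picks that up from -D.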
-Abe

On Wed, Jul 29, 2015 at 6:06 PM, Ron Gonzalez <[email protected]> wrote:

> Hi Abe,
> I was able to run sqoop from the command line using LocalJobRunner, so I
> can avoid having to deal with the container memory issues.
> I will try to run it overnight and see if it can complete when it has
> access to my entire memory.
> Since I'm now running it as a local process, I should be able to do
> whatever kind of stuff you would need in order to understand what's going
> on.
> I am also able to debug it in Eclipse, so I'm going to see if I can
> figure it out myself as well, and report back my findings...
>
> Thanks,
> Ron
>
> On 07/22/2015 03:33 PM, Abraham Elmahrek wrote:
>
> Hey man,
>
> Is there a stack trace or core dump you could provide? You might be
> right, but there's no way for us to validate that.
>
> The problem of compacting several files is definitely an issue. This is
> the topic of https://issues.apache.org/jira/browse/SQOOP-1094. It's a
> great idea to add Parquet support as well, I feel.
>
> -Abe
>
> On Tue, Jul 21, 2015 at 11:19 PM, Ron Gonzalez <[email protected]>
> wrote:
>
>> Hi,
>> Quick question on the parquet support for sqoop import.
>> I am finding that while trying to load a million-row table, I can never
>> get the map-reduce job to complete because the containers keep getting
>> killed. I have already set the container size to 2 GB and also changed
>> the mapreduce java opts to -Xmx2048m.
>> Is there some configuration I can set to address this?
>> I believe the problem is that for a parquet file with a lot of rows, we
>> have to keep the column data in memory before we can flush it to file,
>> so the larger the number of rows, the larger the amount of memory
>> required before we can flush. I'm open to creating smaller parquet
>> files, but I'm going to end up with a lot of parquet files in the
>> process.
>> Any suggestions?
>>
>> Thanks,
>> Ron
