[jira] [Resolved] (CASSANDRA-7271) Bulk data loading in Cassandra causing OOM

2014-05-25 Thread Prasanth Gullapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Gullapalli resolved CASSANDRA-7271.


Resolution: Not a Problem

My sincere apologies, Michael Shuler. Initially I thought it was some problem 
with the latest code I had taken, but I then found out that the issue was with 
my environment, and hence I am closing the issue. 

> Bulk data loading in Cassandra causing OOM
> --
>
> Key: CASSANDRA-7271
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7271
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Prasanth Gullapalli
>Assignee: Michael Shuler
> Attachments: BulkLoadFiles.zip, Classes.zip
>
>
> I am trying to load data from a CSV file into a Cassandra table using 
> SSTableSimpleUnsortedWriter. As the latest Maven Cassandra dependencies have 
> some issues with it, I have taken the _next_ beta (rc) version cut, as 
> suggested in CASSANDRA-7218. But after taking it, I am facing issues with 
> bulk data loading.
> Here is the piece of code which loads the data:
> {code:java}
> public void loadData(TableDefinition tableDefinition, InputStream csvInputStream) {
>     createDataInDBFormat(tableDefinition, csvInputStream);
>     Path dbFilePath = Paths.get(TEMP_DIR, keyspace, tableDefinition.getName());
>     // Alternative: stream the generated directory with Cassandra's bulk loader tool
>     // instead of the in-process JMX call.
>     //BulkLoader.main(new String[]{"-d", "localhost", dbFilePath.toUri().getPath()});
>     try {
>         // Ask the Cassandra node, via JMX, to stream the generated sstables in.
>         JMXServiceURL jmxUrl = new JMXServiceURL(String.format(
>                 "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", cassandraHost, cassandraJMXPort));
>         JMXConnector connector = JMXConnectorFactory.connect(jmxUrl, new HashMap<String, Object>());
>         MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
>         ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
>         StorageServiceMBean storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
>         storageBean.bulkLoad(dbFilePath.toUri().getPath());
>         connector.close();
>     } catch (IOException | MalformedObjectNameException e) {
>         e.printStackTrace();
>     }
>     FileUtils.deleteQuietly(dbFilePath.toFile());
> }
>
> private void createDataInDBFormat(TableDefinition tableDefinition, InputStream csvInputStream) {
>     try (Reader reader = new InputStreamReader(csvInputStream)) {
>         String tableName = tableDefinition.getName();
>         File directory = Paths.get(TEMP_DIR, keyspace, tableName).toFile();
>         directory.mkdirs();
>         String yamlPath = "file:\\" + CASSANDRA_HOME + File.separator + "conf" + File.separator + "cassandra.yaml";
>         System.setProperty("cassandra.config", yamlPath);
>         // The last argument is the writer's in-memory buffer size in MB: rows are
>         // buffered (and sorted) on the heap and flushed to a new sstable once it fills.
>         SSTableSimpleUnsortedWriter writer = new SSTableSimpleUnsortedWriter(
>                 directory, new Murmur3Partitioner(), keyspace,
>                 tableName, AsciiType.instance, null, 10);
>         long timestamp = System.currentTimeMillis() * 1000;
>         CSVReader csvReader = new CSVReader(reader);
>         String[] colValues = null;
>         List<ColumnDefinition> columnDefinitions = tableDefinition.getColumnDefinitions();
>         while ((colValues = csvReader.readNext()) != null) {
>             if (colValues.length != 0) {
>                 writer.newRow(bytes(colValues[0])); // first CSV field is the row key
>                 for (int index = 1; index < colValues.length; index++) {
>                     ColumnDefinition columnDefinition = columnDefinitions.get(index);
>                     writer.addColumn(bytes(columnDefinition.getName()), bytes(colValues[index]), timestamp);
>                 }
>             }
>         }
>         csvReader.close();
>         writer.close();
>     } catch (IOException e) {
>         e.printStackTrace();
>     }
> }
> {code}
> When I try to run loadData, I get the following exception:
> {code:xml}
> 11:23:18.035 [45742123@qtp-1703018180-0] ERROR com.adaequare.common.config.TransactionPerRequestFilter.doInTransactionWithoutResult 39 - Problem in executing request : [http://localhost:8081/mapro-engine/rest/masterdata/pumpData]. Cause :org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
> javax.servlet.ServletException: org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
>     at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:392) ~[jersey-container-servlet-core-2.6.jar:na]
>     at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:381) ~[jersey-container-servlet-core-2.6.jar:na]
>     at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:344) ~[jersey-container-servlet-core-2.6.jar:na]
>     at org.glassfish.jersey.servlet.ServletContai

[jira] [Resolved] (CASSANDRA-7271) Bulk data loading in Cassandra causing OOM

2014-05-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-7271.
---

Resolution: Not a Problem

The unsorted writer has to buffer everything in memory until you are done, in 
order to sort it. Write fewer rows at a time, or use the sorted writer 
(SSTableSimpleWriter) instead.
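
To illustrate the second suggestion, here is a minimal sketch of the sorted variant, assuming the same API generation as the code quoted above (the SSTableSimpleWriter constructor shown here simply mirrors the unsorted one from the report and may differ in other releases). The sorted writer streams each row straight to disk, but rows must already arrive in partitioner token order:

{code:java}
import java.io.File;

import org.apache.cassandra.db.marshal.AsciiType;
import org.apache.cassandra.dht.Murmur3Partitioner;
import org.apache.cassandra.io.sstable.SSTableSimpleWriter;

import static org.apache.cassandra.utils.ByteBufferUtil.bytes;

public class SortedCsvWriterSketch
{
    // Sketch only: rows must already be ordered by partitioner token, which is
    // what lets the sorted writer avoid buffering them all on the heap. As in
    // the reported code, the cassandra.config system property is assumed to
    // point at a valid cassandra.yaml before this runs.
    public static void writeSorted(File directory, String keyspace, String table,
                                   String[][] rowsInTokenOrder) throws Exception
    {
        SSTableSimpleWriter writer = new SSTableSimpleWriter(
                directory, new Murmur3Partitioner(), keyspace, table,
                AsciiType.instance, null);

        long timestamp = System.currentTimeMillis() * 1000;
        for (String[] row : rowsInTokenOrder)
        {
            writer.newRow(bytes(row[0]));  // first field is the row key, as in the report
            for (int i = 1; i < row.length; i++)
            {
                // "col" + i is a placeholder column name; the reported code
                // derives column names from its TableDefinition instead.
                writer.addColumn(bytes("col" + i), bytes(row[i]), timestamp);
            }
        }
        writer.close();  // seals the current sstable
    }
}
{code}

The first suggestion amounts to keeping less data in the client heap per load, for example by splitting the CSV into smaller batches and bulk-loading each batch before generating the next.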

> Bulk data loading in Cassandra causing OOM
> --
>
> Key: CASSANDRA-7271
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7271
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Prasanth Gullapalli
>