That's okay! It won't grow significantly bigger than 4 GB. If you shut down
the DB, the size will be a couple of MB at most. What you're seeing now
includes the disk cache.
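
One way to check this is to close the database first (for example with
`graph.shutdown()` from the Blueprints API) and then measure the database
directory again. The sketch below is only an illustration: `DbSize` and
`directorySize` are hypothetical names, and it sums logical file sizes, which
can differ from the allocated-block figure that `du` reports.

```scala
import java.nio.file.{Files, Path, Paths}

object DbSize {
  // Sum the sizes of all regular files under `root`, recursively.
  def directorySize(root: Path): Long = {
    val stream = Files.walk(root)
    try {
      var total = 0L
      val it = stream.iterator()
      while (it.hasNext) {
        val p = it.next()
        if (Files.isRegularFile(p)) total += Files.size(p)
      }
      total
    } finally stream.close() // Files.walk holds open directory handles
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the database directory is passed as the first argument.
    val root = Paths.get(if (args.nonEmpty) args(0) else ".")
    val bytes = directorySize(root)
    println(f"${bytes / (1024.0 * 1024.0)}%.1f MB in $root")
  }
}
```

Running it once while the database is open and once after shutdown should show
whether the extra space is released.
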
On Thursday, April 3, 2014 10:47:13 AM UTC-4, IQH wrote:
>
> Hello, I am new to using OrientDB. I am using version 1.7 rc2. I am
> building a graph model using the code snippet below. The code iterates
> through a directory containing *.csv files. Each directory name denotes
> an exchange name. Each exchange can contain 100s or 1000s of *.csv files.
> Each *.csv file is an instrument's name. So the desired model to build
> for this use case looks like:
>
> [vertex] exchange -->[edge] lists --> [vertex] instrument -->[edge]
> snapshot --> [vertex] date --> [edge] snapshot --> [vertex] [7 properties]
> eod
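
For a rough sense of how many graph elements this model produces, the counts
can be estimated from the row counts alone. In the snippet quoted further
down, every data row adds two vertices (date and eod) and two edges
("snapshots" and "measure"), and every file adds one instrument vertex and one
"lists" edge. The sketch below is an illustration only; `GraphSizeEstimate`,
`Counts`, and `estimate` are hypothetical names, and the sample row count is
made up because the post does not give one.

```scala
object GraphSizeEstimate {
  final case class Counts(vertices: Long, edges: Long)

  // rowsPerFile: number of data rows (first line excluded) in each
  // instrument file of a single exchange.
  def estimate(rowsPerFile: Seq[Long]): Counts = {
    val rows = rowsPerFile.sum
    val instruments = rowsPerFile.size.toLong
    Counts(
      // 1 exchange + 1 vertex per instrument + (date, eod) per row
      vertices = 1 + instruments + 2 * rows,
      // 1 "lists" edge per instrument + ("snapshots", "measure") per row
      edges = instruments + 2 * rows
    )
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical input: 106 files with 2,500 data rows each.
    val c = estimate(Seq.fill(106)(2500L))
    println(s"vertices=${c.vertices}, edges=${c.edges}")
  }
}
```
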
>
> For this test case I used 106 files of smaller size totaling 14 MB on
> disk. After processing with the above model the on disk database size
> (with du -hc on Mac OS X) is 3.9 GB.
>
> My concern is there are over 64,000 files to process totaling 5.33 GB of
> text data.
>
> Am I doing something wrong in the model/relationships etc. or is there an
> optimization I can use?
>
> <code snippet>
>
> val dir = new File(directory.get)
> val dirs = subdirs(dir)
> var exchange: Vertex = null
> var instrument: Vertex = null
> var eod: Vertex = null
> var date: Vertex = null
> var source: Source = null
> var linesIterator: Iterator[String] = null
>
> // Graph handle
> val graph = factory.getNoTx()
>
> try {
>   for (d <- dirs) {
>     println("Exchange: " + d.getName())
>     // Create a new vertex for each exchange
>     exchange = graph.addVertex()
>     exchange.setProperty("name", d.getName())
>     graph.getRawGraph().declareIntent(new OIntentMassiveInsert())
>     // Iterate through the files in the directory
>     for (f <- d.listFiles() if (selected.get.contains(d.getName))) {
>       instrument = graph.addVertex()
>       // Escape the dot so that only a literal ".csv" is stripped
>       instrument.setProperty("symbol", f.getName().split("""\.csv""")(0))
>       // Add an edge from the exchange vertex to the instrument vertex
>       exchange.addEdge("lists", instrument)
>       source = Source.fromFile(f)
>       linesIterator = source.getLines()
>       var count = 0
>       // Iterate through the lines in the file
>       for (v <- linesIterator) {
>         if (count < 1) {
>           // Skip the first line of each file
>           count += 1
>         } else {
>           var data = v.split(",")
>           val size = data.size
>           // Pad short rows with empty strings up to 7 fields
>           if (size < 7) {
>             val insert = new Array[String](7)
>             for (i <- 0 until 7) {
>               if (i >= size) {
>                 insert(i) = ""
>               } else {
>                 insert(i) = data(i)
>               }
>             }
>             data = insert
>           }
>           date = graph.addVertex()
>           instrument.addEdge("snapshots", date)
>           eod = graph.addVertex()
>           ElementHelper.setProperties(eod,
>             "date", data(0),
>             "open", doubleValue(data(1)).get,
>             "high", doubleValue(data(2)).get,
>             "low", doubleValue(data(3)).get,
>             "close", doubleValue(data(4)).get,
>             "volume", longValue(data(5)).get,
>             "adjClose", doubleValue(data(6)).get)
>           date.addEdge("measure", eod)
>           date.setProperty("date", data(0))
>         }
>       }
>       graph.commit()
>       source.close()
>     }
>     instrument = null
>     eod = null
>     graph.getRawGraph().declareIntent(null)
>   }
> }
> </code snippet>
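
As a side note, the row handling in the snippet above (skip the first line of
each file, presumably a header, and pad short rows to 7 fields) can be written
more compactly with standard Scala collection methods. This is a minimal
sketch that should behave the same way; `EodRows`, `fields`, and `dataRows`
are hypothetical names, and the sample header text is made up.

```scala
object EodRows {
  // Split a CSV line and pad it with empty strings to at least `width` fields.
  // Rows that already have `width` or more fields are returned unchanged.
  def fields(line: String, width: Int = 7): Array[String] =
    line.split(",").padTo(width, "")

  // Drop the first line, then split and pad every remaining line.
  def dataRows(lines: Iterator[String]): Iterator[Array[String]] =
    lines.drop(1).map(line => fields(line))

  def main(args: Array[String]): Unit = {
    // Hypothetical sample: a header followed by a row that lacks the 7th field.
    val sample = Iterator(
      "Date,Open,High,Low,Close,Volume,Adj Close",
      "2014-04-03,1.0,2.0,0.5,1.5,100"
    )
    dataRows(sample).foreach(r => println(r.mkString("|")))
  }
}
```

With `scala.io.Source`, the input would be `source.getLines()`, as in the
original loop.
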
>
> Thanks for any responses.
>