Re: [h2] "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Noel Grandin
On Mon, 7 Oct 2019 at 20:58, Tim Fielder wrote: > I also wanted to ask about performance differences across database modes. > Right now I'm constructing the database in file mode and just storing it on > the same machine that's running the application. I can't set up an actual > server to

Re: [h2] "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Tim Fielder
The suggestion of populating the tables via the CREATE TABLE ... AS SELECT DISTINCT is a good one so far. It's nearly a 30% performance improvement compared to the previous method, which cuts about 7 whole days off of the expected time to complete populating the database. I'm currently

Re: [h2] "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Tim Fielder
O(n log n) sounds about right. Based on the most recent results I get this formula for the expected time to process X files of 10,000 articles each. With an x^2 term of 0.21

[h2] Re: "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Tim Fielder
I'm using version 1.4.199 for this. That's good to know though. On Monday, October 7, 2019 at 7:51:29 AM UTC-4, Evgenij Ryazanov wrote: > > BTW, what version of H2 do you use? > > IN (query) condition needs a lot of memory in 1.4.197 and older versions. > -- You received this message because

[h2] Re: "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Evgenij Ryazanov
BTW, what version of H2 do you use? IN (query) condition needs a lot of memory in 1.4.197 and older versions. -- You received this message because you are subscribed to the Google Groups "H2 Database" group. To unsubscribe from this group and stop receiving emails from it, send an email to

Re: [h2] "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Noel Grandin
On 2019/10/07 1:32 PM, Tim Fielder wrote:> The problem with this approach is that since the tables are indexed, the insert time grows quadratically with the size > of the table. As a result I can handle 230,000 articles in about 2 hours, but the full 46.7 million will take at least > 300

[h2] "SELECT ... WHERE ... IN" on extremely large tables

2019-10-07 Thread Tim Fielder
I'm trying to develop an application that parses scientific journal publications from a few sources and collects them in a single database. The problem is that there's just too damn many of them and it creates a catch-22 situation where I either need literally years to build the database, or

[h2] Re: Failed to change "not null" option on a column on version 1.4.199

2019-10-07 Thread Evgenij Ryazanov
Hello. Your command is definitely not valid. Where did you find it? We don't have such syntax in the documentation. It was accepted due to bug. Use ALTER TABLE tableName ALTER COLUMN columnName SET NOT NULL instead. The SET DATA TYPE clause may be used only to change the data type of the

[h2] Failed to change "not null" option on a column on version 1.4.199

2019-10-07 Thread nzanaga
Hi, with driver version "1.4.199" query to alter column to modify not null options fails. Using versione "1.4.197" this works fine. Query: CREATE TABLE test1 ( a integer ) ALTER TABLE test1 ALTER COLUMN a SET DATA TYPE INTEGER NOT NULL -- You received this message because you are