Hello!
I suggest checking the following possibilities:
Does performance improve dramatically if you run it on only 10% of the data, i.e.,
~1 million records?
Does something change when you have only one client connected?
Note that I was running this example on a single node so it should not be
hard
Actually there's only one row in b.
SELECT COUNT(*) FROM b where x = '1';
COUNT(*) 1
1 row selected (0.003 seconds)
Maybe it is because join performance drops dramatically when the data size is
more than 10 million rows, or when the cluster has a lot of clients connected?
My 6 node cluster has 10 clients
Hello!
I have indeed tried a use case like yours:
0: jdbc:ignite:thin://127.0.0.1/> create index on b(x,y);
No rows affected (9,729 seconds)
0: jdbc:ignite:thin://127.0.0.1/> select count(*) from a;
COUNT(*) 1
1 row selected (0,017 seconds)
0: jdbc:ignite:thin://127.0.0.1/> select count(*) from
Here's the detailed information for my join test.
0: jdbc:ignite:thin://sap-datanode6/> select * from a;
x 1
y 1
A bearbrick
1 row selected (0.002 seconds)
0: jdbc:ignite:thin://sap-datanode6/> select count(*) from b;
COUNT(*) 14337959
1 row selected (0.299 seconds)
0:
Ray,
This sounds suspicious. Please show your configuration and the execution
plan for the query.
-Val
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Hello!
Can you show the index that you are creating here?
Regards,
--
Ilya Kasnacheev
On Tue, 25 Sep 2018 at 8:23, Ray wrote:
> Let's say I have two tables I want to join together.
> Table a has around 10 million rows and its primary key is x and y.
> I have created an index on fields x and y
Let's say I have two tables I want to join together.
Table a has around 10 million rows and its primary key is x and y.
I have created an index on fields x and y for table a.
Table b has one row and its primary key is x and y.
The primary key for that row in table b has a corresponding row in
If the join is indexed and collocated, it can still be pretty fast. Do you have a
particular query that is slower with optimization than without?
-Val
Hi Val, thanks for the reply.
I'll try again and let you know if I missed something.
By "Ignite is not optimized for join", I mean that Ignite currently only supports
nested loop joins, which are very inefficient when joining two large tables.
Please refer to these two tickets for details.
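
To make the nested loop point concrete, here is a minimal sketch in plain Java. It is not Ignite's implementation: the class `JoinSketch`, the `Row` type, and the hash map standing in for an index on (x, y) are all hypothetical and exist only for illustration. It compares how many inner rows a nested loop visits with and without a lookup structure on the join key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Ignite.
public class JoinSketch {
    /** Row with a composite key (x, y), mirroring tables a and b in the thread. */
    static final class Row {
        final String x, y, payload;
        Row(String x, String y, String payload) { this.x = x; this.y = y; this.payload = payload; }
    }

    /** Number of inner rows visited by the most recent join call. */
    static long visited = 0;

    /** Unindexed nested loop: every outer row scans the whole inner table. */
    static List<String> nestedLoopJoin(List<Row> outer, List<Row> inner) {
        visited = 0;
        List<String> out = new ArrayList<>();
        for (Row o : outer) {
            for (Row i : inner) {
                visited++;
                if (o.x.equals(i.x) && o.y.equals(i.y)) out.add(o.payload + "|" + i.payload);
            }
        }
        return out;
    }

    /** Builds a lookup on (x, y); an assumed stand-in for a composite index. */
    static Map<List<String>, List<Row>> buildIndex(List<Row> rows) {
        Map<List<String>, List<Row>> index = new HashMap<>();
        for (Row r : rows) {
            index.computeIfAbsent(Arrays.asList(r.x, r.y), k -> new ArrayList<>()).add(r);
        }
        return index;
    }

    /** Indexed nested loop: each outer row does one key lookup instead of a scan. */
    static List<String> indexedJoin(List<Row> outer, Map<List<String>, List<Row>> innerIndex) {
        visited = 0;
        List<String> out = new ArrayList<>();
        for (Row o : outer) {
            List<Row> matches = innerIndex.getOrDefault(Arrays.asList(o.x, o.y), new ArrayList<>());
            for (Row i : matches) {
                visited++;
                out.add(o.payload + "|" + i.payload);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> big = new ArrayList<>();
        for (int n = 0; n < 100_000; n++) {
            big.add(new Row(String.valueOf(n), String.valueOf(n), "big" + n));
        }
        List<Row> small = Arrays.asList(new Row("1", "1", "bearbrick"));

        nestedLoopJoin(small, big);
        System.out.println("scan: inner rows visited = " + visited);
        indexedJoin(small, buildIndex(big));
        System.out.println("indexed: inner rows visited = " + visited);
    }
}
```

With one outer row the scan cost equals the size of the inner table, and it multiplies with every additional outer row. That is why a nested loop join depends so heavily on an index on the join columns being available and actually used.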
Ray,
Per my understanding, pushdown filters are propagated to Ignite either way;
this is not related to the "optimization". Optimization affects joins,
groupings, aggregations, etc. So, unless I'm missing something, the behavior
you're looking for is achieved by setting
Hi,
I am not sure that it will work, but you can try the following:
SparkSession spark = SparkSession
.builder()
.appName("SomeAppName")
.master("spark://10.0.75.1:7077")
.config(OPTION_DISABLE_SPARK_SQL_OPTIMIZATION, "false") // or "true"
Currently, the OPTION_DISABLE_SPARK_SQL_OPTIMIZATION option can only be set at
the Spark session level.
It means I can only have Ignite optimization or Spark optimization for one
Spark job.
Let's say I want to load data into spark memory with pushdown filters using
Ignite optimization.
For example, I