[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1600123752 @jrgemignani Is there a branch for the patch you mentioned ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1672522910 @jrgemignani thank you for taking a look. Indeed without creating a functional index there is no way to check if that actuality might work. If you were able to create such an index, one thing seems suspicious to me. With an index the actual execution time is longer than without it. But this probably depends on the setup. For basic queries like searching the vertex by its property we decided not to use Cypher calls, but instead usual SQL queries with functional indexes around properties values. It is much faster and we can use native FTS this way. I believe this is possible because agtype is a superset of a jsonb hence we can use it like jsonb. But for traversal queries the index should help reduce the filtering time of incoming or outgoing vertices. For example queries with scenarios - find all outgoing vertices with label 'A' from the start vertex label 'B' where some 'A''s property IN [value1, value2]. I assume with more hops involved ('A's could be chained in a hierarchy) such queries take longer without index on a specific property. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1673498025 @jrgemignani this looks promising, thank you. By the way can you get rid of using label id while creating the index definition? As I recall we've got this whole function call from the execution plan as it is. I assume this might involve changing the main logic so the code doesn't use label ID either. But when used with liquebase in the kubernetes environment this means we should precalculate the label id before we use it which is not convinient for schema initialization scripts. Ideally the index creation scripts should use predefined data if possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1675728123 @jrgemignani thank you. We will try this fix any time soon. As for the graph OID I believe we could get this OID by graph name from apache age tables and dynamically construct index creation scripts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1678847417 @jrgemignani unfortunately we were not able to test it because master branch does not compile for some reason. We usualy build release distributions inside docker based on postgres docker images. Probably something was broken recently because about a month ago master was compileable. Created a ticket for this: https://github.com/apache/age/issues/1140 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1678886239 @jrgemignani Looks like master moved to postgres 15, we compiled the master and started checking. https://github.com/apache/age/issues/1140 closed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1680176935 @jrgemignani indexes work fine for cypher equality operator but we were not able to use the index for Cypher IN operator. For example: ``` ALTER FUNCTION agtype_in_operator IMMUTABLE; CREATE INDEX profile_pet_btree_check_in_operator ON test_graph."profile" USING BTREE ( ag_catalog.agtype_in_operator('["dog", "cat"]'::agtype, agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(id, _label_name('17580'::oid, id), properties), '"pet"'::agtype])) ); ``` The query: ``` select any_profile from ag_catalog.cypher('test_graph',$$ EXPLAIN ANALYZE MATCH (any_profile:`profile` { hidden: false }) WHERE any_profile.pet IN ['dog', 'cat'] RETURN any_profile $$ ) as (any_profile ag_catalog.agtype); ``` For some reason the optimizer does not involve the index even if we try to multiple data ten times. Is it safe to change the volatility flag for ``` agtype_in_operator ``` function ? | QUERY PLAN | | :--- | | Gather \(cost=1000.00..11311.91 rows=133 width=32\) \(actual time=0.616..345.966 rows=133994 loops=1\) | | Workers Planned: 2 | | Workers Launched: 2 | | -> Parallel Seq Scan on profile any\_profile \(cost=0.00..10298.61 rows=55 width=32\) \(actual time=0.254..331.683 rows=44665 loops=3\) | | Filter: \(\(properties @> agtype\_build\_map\('hidden'::text, 'false'::agtype\)\) AND agtype\_in\_operator\('\["dog", "cat"\]'::agtype, agtype\_access\_operator\(VARIADIC ARRAY\[\_agtype\_build\_vertex\(id, \_label\_name\('17580'::oid, id\), properties\), '"pet"'::agtype\]\)\)\) | | Rows Removed by Filter: 88669 | | Planning Time: 0.295 ms | | Execution Time: 350.271 ms | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1687762386 @jrgemignani Does it mean it requires additional development ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1689461877 @dehowef thanks for taking a look at this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1700930100 @dehowef Thank you for paying attention to this. Right now, the "==" operator also involves functional calls, so the only way to kick out the index is to reverse engineer the execution plan and extract the function itself in order to create the index. Ideally, the WHERE should not use functional indexes at all, like MATCH does not for GIN indexes, for two reasons: - It could look like magic, hence the need to extract the chain of functions calls from the plan. - If functions definitions change, the indexes stop working. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1705124416 @dehowef I believe for now only IN operator. But thanks to @jrgemignani the fix for "=" operator is already in version 1.4.0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1711578142 @rafsun42 thank you. We will check this next week and let you know. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1713595065 @rafsun42 we checked. It works on my machine. This could be a good solution -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1723307224 @jrgemignani Thanks for sharing the PR. I have built that branch to see how it could affect our typical queries. It turns out that for our use cases, we do not see much difference. Usually, our typical queries are like this: ``` select properties from ag_catalog.cypher('test_graph', $$ //EXPLAIN ANALYZE MATCH (start_vertex:`scope` {hidden: false}) WHERE start_vertex.name = 'scope_1' OR start_vertex.name = 'scope_2' OR start_vertex.name = 'scope_3' OR start_vertex.name = 'scope_4' OR start_vertex.name = 'scope_5' OR start_vertex.name = 'scope_6' RETURN { properties: properties(start_vertex), ___edge_reserved: properties(null) } UNION MATCH (from:`scope` {hidden: false}) -[e1:`edges`*0..5 {hidden: false}]->(to1:`role` {hidden: false}) -[e2:`edges`*1..1 {hidden: false, fromCollection: 'role', toCollection: 'permission'}]->(to2:`permission` {hidden: false}) -[e3:`edges`*1..1 {hidden: false, fromCollection: 'permission', toCollection: 'resource'}]->(to3:`resource` {hidden: false}) WHERE from.name = 'scope_1' OR from.name = 'scope_2' OR from.name = 'scope_3' OR from.name = 'scope_4' OR from.name = 'scope_5' OR from.name = 'scope_6' WITH collect({ properties: properties(to1), ___edge_reserved: properties(last(e1)) }) + collect({ properties: properties(to2), ___edge_reserved: properties(last(e2)) }) + collect({ properties: properties(to3), ___edge_reserved: properties(last(e3)) }) as result UNWIND result as result_out RETURN DISTINCT result_out LIMIT 1000 $$) as (properties ag_catalog.agtype); ``` As you can see we use only one table (e_label) for edges and structures like ``` {hidden: false, fromCollection: 'role', toCollection: 'permission'} ``` improve the execution time because there is a call to a GIN index that cuts unnecessary edges. That also helps simplify backend logic. So, such a query: ``` select properties from ag_catalog.cypher('test_graph', $$ //EXPLAIN ANALYZE MATCH (from:`domain` {hidden: false})-[any_edge:`edges`*1..10 {hidden: false, fromCollection: 'domain', toCollection: 'domain', relation: 'managed'}]->(to:`domain` {hidden: false, id: 'domain_root_1'}) WITH from, any_edge, to, head(any_edge) AS edge_check_in, head(any_edge) AS edge_result_in RETURN DISTINCT { properties: properties(from), ___edge_reserved: properties(edge_result_in) } LIMIT 1000 $$) as (properties ag_catalog.agtype); ``` works even faster than the version with lots of edge tables. ``` select properties from ag_catalog.cypher('test_graph', $$ //EXPLAIN ANALYZE MATCH (from:`domain` {hidden: false})-[any_edge:`E_DOMAIN_MANAGED_DOMAIN`*1..10 {hidden: false}]->(to:`domain` {hidden: false, id: 'domain_root_1'}) WITH from, any_edge, to, head(any_edge) AS edge_check_in, head(any_edge) AS edge_result_in RETURN DISTINCT { properties: properties(from), ___edge_reserved: properties(edge_result_in) } LIMIT 1000 $$) as (properties ag_catalog.agtype); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1723616635 @jrgemignani I believe @rafsun42 also suggested the solution for IN operator. I can move the issue with "IN" operator to another ticket. Let me know if it is convenient to do so or leave this issue open until the solution for "IN" operator is implemented. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [age] vladiksun commented on issue #1000: Index is not used in the WHERE clause
vladiksun commented on issue #1000: URL: https://github.com/apache/age/issues/1000#issuecomment-1723634217 @jrgemignani, @rafsun42 ok I will create a new issue for the proposed solution for IN operator and link it to the this one. Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org