Re: Creating Nodes in Bulk,

Muhammad Shoaib Wed, 01 Sep 2021 23:25:33 -0700

HI Josh, Hi All,

Thanks for the response.


Going forward with option 1, would you suggest inserting values directly
into tables or constructing cypher commands.

Kind regards
Shoaib

On Thu, Sep 2, 2021 at 12:48 PM Josh Innis <josh.in...@bitnine.net> wrote:

> Hi all,
>
> Upon thinking about this more I realized there is an underlying issue that
> needs to be discussed that directly affects the approach needed on this
> feature. To do so: I need to talk about an early design mistake that was
> made on this project.
>
> Back in 2019 when this project was still called Agensgraph-Extension and
> there was no consideration to donate to Apache, the project ignored a basic
> rule of programming: don't use deprecated features. In particular: the use
> of OIDs by non pg_catalog tables. OID stands for Object Id. In short,
> Postgres uses OIDs to assign a unique value to a row that no other system
> table row has. In earlier versions of Postgres they gave the right of the
> user to mark rows in other tables as "system" tables and give them an oid
> (An oid that no POSTGRES system table had either).
>
> This feature was deprecated in version Postgres 8.5. We started development
> of AGE on Postgres 11. To try and summarize a decision that was made in the
> first 3 months of a project. Some of the people involved (but not all) are
> no longer working on the project thought: "The reason why they did this is
> because they don't want systems built on top of Postgres to consider
> themselves system tables. We are not technically a 'system built on top of
> Postgres.' We are an extension of Postgres. Therefore our metadata tables
> are technically system tables. Therefore our unique identifiers should be
> unique in all of Postgres. Therefore we should use OIDs... Oh, it's
> deprecated. We should still use them. Why not, what's the worst that could
> happen?"
>
> The worst that could happen is Postgres removes the functionality of adding
> Oids to users tables (which we are) in Postgres 12. They removed the
> functionality of 'WITH OIDS' from their SQL grammar implementation and they
> removed Oid from being returned on certain Postgres internal C functions
> that they used to give us. This is why AGE is incompatible with Postgres
> 12+ (NOTE: possibly other things too but this is definitely THE issue
> causing us problems).
>
> Our goal since that point has been to implement as much of the openCypher
> spec as possible. Which is why we have not dealt with this problem yet. The
> good news is: none of the features implemented since the point we realized
> this mistake have made the job of changing AGE to fix this take more time.
>
> The reason why I am bringing this up now in this email is: Option #1 is
> clearly the superior option. We should funnel as much functionality that we
> can through the ANSI SQL and the openCypher specification that we can. The
> internal C functions of Postgres are not part of those standards.
> Therefore, this bulk import tool needs to create dynamic sql queries on a
> row per row basis, rather than use the internal Postgres functions to
> bypass some of those standards. Even though it is slower.
>
> With all of that said: we need to deal with this issue soon. In other
> orders: soon == 2021.
>
> TLDR: Option #1 is better.
> We create value for our clients by connecting the world's data.
> Josh Innis Software Engineer
> Core R&D Team
> *P*: (831)278-0327
> 3945 Freedom Circle #260, Santa Clara, CA 95054
> <
> https://www.google.com/maps/place/3945+Freedom+Cir+%23260,+Santa+Clara,+CA+95054/
> >
> bitnine.net
>
>
> On Wed, Sep 1, 2021 at 10:16 AM Josh Innis <josh.in...@bitnine.net> wrote:
>
> > Muhammed,
> >
> > Directly calling the functions that insert data into the vertex and edge
> > tables might not be the best way. They are tightly integrated with the
> > Postgres stages of execution (Parse, Transform, Optimize, and Execution).
> > There are two options I would recommend:
> >
> >
> >    1. Dynamically create either a Cypher CREATE query or create an SQL
> >    insert query.
> >    2. Take a look at how we insert data into our ag_catalog tables
> >    ag_graph and ag_label. Both are located in src/backend/catalog. This
> would
> >    decouple a great amount of our insertion logic from the Postgres
> (Though
> >    obviously not completely). If you take a look at agtype_build_map in
> >    src/backend/utils/adt/agtype.c that would give you an idea on how to
> create
> >    the property map from the spreadsheet.
> >
> > Both solutions have potential drawbacks. Dynamically creating a
> cypher/sql
> > query will be slower however, you are using the standard interface
> provided
> > by both us and Postgres (via the SQL and openCypher language
> > specifications). Which would not be the case in the 2nd option, however
> > that option could decouple some logic that might be useful for later
> > features.
> >
> > We create value for our clients by connecting the world's data.
> > Josh Innis Software Engineer
> > Core R&D Team
> > *P*: (831)278-0327
> > 3945 Freedom Circle #260, Santa Clara, CA 95054
> > <
> https://www.google.com/maps/place/3945+Freedom+Cir+%23260,+Santa+Clara,+CA+95054/
> >
> > bitnine.net
> >
> >
> > On Wed, Sep 1, 2021 at 9:42 AM Muhammad Shoaib <sho...@bitnine.net>
> wrote:
> >
> >> Hi Folks,
> >>
> >> I am trying to write a function to load nodes from CSV into AGE. I have
> >> data into C data structure but don't know how to proceed further. The
> >> particular questions are
> >>
> >> Which AGE data structure I need to use
> >> which function I need to call in order to insert data into PG Tables.
> >>
> >> Looking forward to hearing from you.
> >>
> >> Kind regards
> >> Shoaib
> >>
> >
>

Re: Creating Nodes in Bulk,

Reply via email to