Hi Matt,

Thank you for the updated information on performance testing; I'll give that a try! I am now on v1.1.0 and have attached a screenshot of a File/Open operation from the GUI that neither performs a listing of S3 nor errors. I have a credentials file in my ~/.aws/ which has several profiles with access to S3. Is there a way to configure which profile Hop should use?

Thank you for your help in getting S3 connections working.
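In case it helps frame the question: if Hop resolves credentials through the AWS SDK's default provider chain (an assumption on my part), I would expect selecting a profile to look something like the sketch below. The profile name is a placeholder for one of the entries in my ~/.aws/credentials:

```shell
# Assumption: Hop's S3 VFS uses the AWS SDK default credential provider
# chain, which reads the AWS_PROFILE environment variable.
# "my-s3-profile" is a placeholder for a profile in ~/.aws/credentials.
export AWS_PROFILE=my-s3-profile

# Then launch the Hop GUI from the same shell so it inherits the variable,
# e.g.:
#   ./hop-gui.sh
```

Is that the intended mechanism, or does Hop expect the profile to be configured elsewhere?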
Regards,

David

On Wed, Jan 26, 2022 at 12:42 PM Matt Casters <[email protected]> wrote:

>> I am excited to be using Hop. My intent is to use Hop to ETL my Neo4j
>> loading and even GDS processing. So far I have built a knowledge graph and
>> ontology via Hop using local files, but I want to schedule/automate the
>> process from S3. After I get that working I will move on to considering how
>> best to write unit tests after Neo4j loading. I saw the unit test feature but
>> do not think it will meet my use case, where I want to run a Cypher query
>> checking for orphaned nodes, for example, and assert that the count is 0.
>
> First of all, there have been a number of improvements to the Neo4j
> plugins in 1.1.0, in particular to the Neo4j Graph Output transform.
> Second, we run integration tests, with unit tests, against a Neo4j Docker
> container every night:
>
> https://ci-builds.apache.org/job/Hop/job/Hop-integration-tests/lastCompletedBuild/testReport/(root)/neo4j/
>
> The workflows and pipelines for that are located here:
> https://github.com/apache/hop/tree/master/integration-tests/neo4j
>
> So in your case you would either run the count in Neo4j or in Hop and
> compare it to a golden record with 0 in it. Or you can pass any output to an
> Abort transform... There are many ways to test these things.
>
> Cheers,
> Matt
>
> On Wed, Jan 26, 2022 at 8:12 PM David Hughes <[email protected]> wrote:
>
>> Hi Matt,
>>
>> Wow, thank you for responding so quickly, and in person! I am on v1.0.0
>> (congratulations, btw). I followed the docs and received the error message
>> that I described:
>>
>> Error browsing to location:
>> 's3://octave-domo-data/patientgraph/reference/ccs_dx_icd10cm_2019_1.csv'
>> FileNotFolderException: Could not list the contents of
>> "file:///Users/davidhughes/servers/hop/s3:/octave-domo-data/patientgraph/reference"
>> because it is not a folder.
>> Root cause: FileNotFolderException: Could not list the contents of
>> "file:///Users/davidhughes/servers/hop/s3:/octave-domo-data/patientgraph/reference"
>> because it is not a folder.
>>
>> I am excited to be using Hop. My intent is to use Hop to ETL my Neo4j
>> loading and even GDS processing. So far I have built a knowledge graph and
>> ontology via Hop using local files, but I want to schedule/automate the
>> process from S3. After I get that working I will move on to considering how
>> best to write unit tests after Neo4j loading. I saw the unit test feature but
>> do not think it will meet my use case, where I want to run a Cypher query
>> checking for orphaned nodes, for example, and assert that the count is 0.
>>
>> Thank you for your insights on how to get S3 reading working in v1.0.0.
>>
>> Regards,
>>
>> David
>>
>> On Wed, Jan 26, 2022 at 11:02 AM Matt Casters <[email protected]> wrote:
>>
>>> Hi David,
>>>
>>> Unfortunately, version 1.0.0 had a missing AWS library. It was
>>> a packaging bug. But a little bird told me that there's a newer version
>>> online at https://hop.apache.org/download/
>>> So if you could try that one you'll probably be more successful.
>>>
>>> If you're on 1.1.0 already, then the docs are at:
>>> https://hop.apache.org/manual/latest/vfs/aws-s3-vfs.html
>>> Maybe those can help.
>>>
>>> Good luck!
>>>
>>> Matt
>>>
>>> On Wed, Jan 26, 2022 at 6:57 PM David Hughes <[email protected]> wrote:
>>>
>>>> I have AWS IAM credentials in ~/.aws on my Mac and tried to access a
>>>> CSV by choosing File/Open, entering s3://, and refreshing. I get a
>>>> file-not-found error indicating that Hop is looking in my local file
>>>> system. Has anyone been able to get S3 file reading configured and
>>>> working properly? I am appreciative of any insight you can provide.
>
> --
> Neo4j Chief Solutions Architect
> ✉ [email protected]

--
David Hughes
Platform Architect
Octave Bioscience
www.octavebio.com
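P.S. To make the orphaned-node check concrete, this is roughly the assertion I would want a post-load unit test to express. The Cypher pattern is one way to match relationship-less nodes; the driver wiring in the comment is a hypothetical sketch (URI and credentials are placeholders), not something I have running yet:

```python
# Sketch of a post-load data quality check: fail if any orphaned
# (relationship-less) nodes exist in the graph.
ORPHAN_QUERY = "MATCH (n) WHERE NOT (n)--() RETURN count(n) AS orphans"

def assert_no_orphans(orphan_count: int) -> None:
    """Raise AssertionError unless the orphan count is exactly 0."""
    if orphan_count != 0:
        raise AssertionError(f"found {orphan_count} orphaned nodes, expected 0")

# Against a live database this would be driven by the Neo4j Python driver,
# e.g. (URI and credentials are placeholders):
#   from neo4j import GraphDatabase
#   with GraphDatabase.driver("bolt://localhost:7687",
#                             auth=("neo4j", "secret")) as driver:
#       record = driver.execute_query(ORPHAN_QUERY).records[0]
#       assert_no_orphans(record["orphans"])
```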
