It's possible that some of the commands are not erroring gracefully for missing parameters? hudi:tablename->savepoint create, for example, would need a commit time for creating the savepoint. If you are able to connect to the dataset, then it should all be working.
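Something like the following should work, passing one of the commit times from your timeline (note: the --commit option name here is my best guess, please verify it against the CLI's built-in help for your build, since option names have shifted between versions):

hudi:tablename->savepoint create --commit 20200724220817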
On Wed, Sep 9, 2020 at 3:27 AM Pratyaksh Sharma <[email protected]> wrote:

> Hi Adam,
>
> I have not used the CLI tool much, but the S3 filesystem is already
> supported in Hudi. You may check the following class to see the list of
> file systems already supported -
>
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/fs/StorageSchemes.java
>
> On Wed, Sep 9, 2020 at 6:46 AM Adam <[email protected]> wrote:
>
> > Hey guys,
> >
> > I'm trying to use the Hudi CLI to connect to tables stored on S3 using
> > the Glue metastore. Using a tip from Ashish M G
> > <https://apache-hudi.slack.com/archives/C4D716NPQ/p1599243415197500?thread_ts=1599242852.196900&cid=C4D716NPQ>
> > on Slack, I added the dependencies, re-built, and was able to use the
> > connect command to connect to the table, albeit with warnings:
> >
> > hudi->connect --path s3a://bucketName/path.parquet
> >
> > 29597 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Loading HoodieTableMetaClient from s3a://bucketName/path.parquet
> > WARNING: An illegal reflective access operation has occurred
> > WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/username/hudi-cli/target/lib/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
> > WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
> > WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
> > WARNING: All illegal access operations will be denied in a future release
> > 29785 [Spring Shell] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> > 31060 [Spring Shell] INFO org.apache.hudi.common.fs.FSUtils - Hadoop Configuration: fs.defaultFS: [file:///], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [org.apache.hadoop.fs.s3a.S3AFileSystem@6b725a01]
> > 31380 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableConfig - Loading table properties from s3a://bucketName/path.parquet/.hoodie/hoodie.properties
> > 31455 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://bucketName/path.parquet
> >
> > Metadata for table tablename loaded
> >
> > However, many of the other commands seem to not be working properly:
> >
> > hudi:tablename->savepoints show
> >
> > ╔═══════════════╗
> > ║ SavepointTime ║
> > ╠═══════════════╣
> > ║ (empty)       ║
> > ╚═══════════════╝
> >
> > hudi:tablename->savepoint create
> >
> > Commit null not found in Commits org.apache.hudi.common.table.timeline.HoodieDefaultTimeline: [20200724220817__commit__COMPLETED]
> >
> > hudi:tablename->stats filesizes
> >
> > ╔════════════╤═══════╤═══════╤═══════╤═══════╤═══════╤═══════╤══════════╤════════╗
> > ║ CommitTime │ Min   │ 10th  │ 50th  │ avg   │ 95th  │ Max   │ NumFiles │ StdDev ║
> > ╠════════════╪═══════╪═══════╪═══════╪═══════╪═══════╪═══════╪══════════╪════════╣
> > ║ ALL        │ 0.0 B │ 0.0 B │ 0.0 B │ 0.0 B │ 0.0 B │ 0.0 B │ 0        │ 0.0 B  ║
> > ╚════════════╧═══════╧═══════╧═══════╧═══════╧═══════╧═══════╧══════════╧════════╝
> >
> > hudi:tablename->show fsview all
> >
> > 171314 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Loading HoodieTableMetaClient from s3a://bucketName/path.parquet
> > 171362 [Spring Shell] INFO org.apache.hudi.common.fs.FSUtils - Hadoop Configuration: fs.defaultFS: [file:///], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [org.apache.hadoop.fs.s3a.S3AFileSystem@6b725a01]
> > 171666 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableConfig - Loading table properties from s3a://bucketName/path.parquet/.hoodie/hoodie.properties
> > 171725 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://bucketName/path.parquet
> > 171725 [Spring Shell] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Loading Active commit timeline for s3a://bucketName/path.parquet
> > 171817 [Spring Shell] INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline - Loaded instants [[20200724220817__clean__COMPLETED], [20200724220817__commit__COMPLETED]]
> > 172262 [Spring Shell] INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView - addFilesToView: NumFiles=0, NumFileGroups=0, FileGroupsCreationTime=5, StoreTimeTaken=2
> >
> > ╔═══════════╤════════╤══════════════╤═══════════╤════════════════╤═════════════════╤═══════════════════════╤═════════════╗
> > ║ Partition │ FileId │ Base-Instant │ Data-File │ Data-File Size │ Num Delta Files │ Total Delta File Size │ Delta Files ║
> > ╠═══════════╧════════╧══════════════╧═══════════╧════════════════╧═════════════════╧═══════════════════════╧═════════════╣
> > ║ (empty)                                                                                                                 ║
> > ╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
> >
> > I looked through the CLI code, and it seems that for true support we
> > would need to add support for the different storage options
> > (hdfs/s3/azure/etc.) in HoodieTableMetaClient. From my understanding,
> > TableNotFoundException.checkTableValidity, one of the first steps in
> > this flow, checks just the HDFS filesystem.
> >
> > Could someone please clarify if this is something already supported and
> > I'm just not configuring it correctly, or if it's something that would
> > need to be added, and whether the HoodieTableMetaClient change is on
> > the right track or not?
> >
> > Thanks,
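Regarding the HoodieTableMetaClient question in the quoted mail: the meta
client should not need per-storage code. All I/O goes through Hadoop's
FileSystem abstraction (the connect log above already shows s3a:// being
resolved to org.apache.hadoop.fs.s3a.S3AFileSystem), and StorageSchemes is
essentially a whitelist of scheme names. A rough sketch of the idea in
Java, simplified and not the actual Hudi code (names and method bodies here
are illustrative):

import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SchemeCheckSketch {

  // Illustrative whitelist, modeled loosely on
  // org.apache.hudi.common.fs.StorageSchemes; the real enum lists more
  // schemes, see the class linked above for the authoritative set.
  enum Scheme {
    FILE("file"), HDFS("hdfs"), S3("s3"), S3A("s3a");

    final String prefix;

    Scheme(String prefix) {
      this.prefix = prefix;
    }

    static boolean isSupported(String scheme) {
      return Arrays.stream(values()).anyMatch(s -> s.prefix.equals(scheme));
    }
  }

  // Hypothetical shape of the table-validity check: it only asks the
  // scheme-resolved Hadoop FileSystem (S3AFileSystem for s3a://) whether
  // the .hoodie meta path exists; nothing in it is HDFS-specific.
  static void checkTableValidity(FileSystem fs, Path metaPath) throws IOException {
    if (!Scheme.isSupported(metaPath.toUri().getScheme())) {
      throw new IllegalArgumentException("Unsupported scheme for " + metaPath);
    }
    if (!fs.exists(metaPath)) {
      throw new FileNotFoundException("Hoodie table not found at " + metaPath);
    }
  }
}

So if connect succeeds, the validity check has already passed for s3a,
which matches the suggestion above that the remaining issues are about
missing command parameters rather than missing storage support.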
