It's possible that some of the commands are not failing gracefully when
required parameters are missing?

hudi:tablename->savepoint create

for example, would need a commit time for creating the savepoint; the
"Commit null not found" error in your output is exactly that missing
parameter showing through.
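Something like the following should work (a sketch, assuming the --commit
option on savepoint create, and reusing the commit instant from your
timeline output):

hudi:tablename->savepoint create --commit 20200724220817
hudi:tablename->savepoints show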

If you are able to connect to the dataset, then S3 access itself is
working. An empty "savepoints show" result is also expected until a
savepoint has actually been created.

On Wed, Sep 9, 2020 at 3:27 AM Pratyaksh Sharma <[email protected]> wrote:

> Hi Adam,
>
> I have not used the CLI tool much, but the S3 filesystem is already
> supported in Hudi. You can check the following class for the list of file
> systems that are already supported:
>
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/fs/StorageSchemes.java
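>
> That class is essentially an enum of scheme names, roughly along these
> lines (paraphrased from memory, so treat the linked file as authoritative):
>
> // Paraphrased sketch of org.apache.hudi.common.fs.StorageSchemes (not verbatim).
> // Each constant maps a filesystem scheme string; isSchemeSupported checks membership.
> public enum StorageSchemes {
>   FILE("file"), HDFS("hdfs"), S3A("s3a"), S3("s3"), GCS("gs"), WASB("wasb"), VIEWFS("viewfs");
>
>   private final String scheme;
>
>   StorageSchemes(String scheme) {
>     this.scheme = scheme;
>   }
>
>   public String getScheme() {
>     return scheme;
>   }
>
>   public static boolean isSchemeSupported(String scheme) {
>     for (StorageSchemes s : values()) {
>       if (s.getScheme().equals(scheme)) {
>         return true;
>       }
>     }
>     return false;
>   }
> }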
>
> On Wed, Sep 9, 2020 at 6:46 AM Adam <[email protected]> wrote:
>
> > Hey guys,
> > I'm trying to use the Hudi CLI to connect to tables stored on S3 using the
> > Glue metastore. Using a tip from Ashish M G
> > <https://apache-hudi.slack.com/archives/C4D716NPQ/p1599243415197500?thread_ts=1599242852.196900&cid=C4D716NPQ>
> > on Slack, I added the dependencies, re-built and was able to use the
> > connect command to connect to the table, albeit with warnings:
> >
> > hudi->connect --path s3a://bucketName/path.parquet
> >
> > 29597 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3a://bucketName/path.parquet
> >
> > WARNING: An illegal reflective access operation has occurred
> > WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/username/hudi-cli/target/lib/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
> > WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
> > WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
> > WARNING: All illegal access operations will be denied in a future release
> >
> > 29785 [Spring Shell] WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >
> > 31060 [Spring Shell] INFO  org.apache.hudi.common.fs.FSUtils  - Hadoop Configuration: fs.defaultFS: [file:///], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [org.apache.hadoop.fs.s3a.S3AFileSystem@6b725a01]
> >
> > 31380 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3a://bucketName/path.parquet/.hoodie/hoodie.properties
> >
> > 31455 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://bucketName/path.parquet
> >
> > Metadata for table tablename loaded
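> >
> > For reference, the dependency change from that tip was, as best I recall,
> > adding something like this to hudi-cli/pom.xml before rebuilding (the
> > versions here are my guess, picked to match the bundled hadoop-auth-2.7.3):
> >
> > <dependency>
> >   <groupId>org.apache.hadoop</groupId>
> >   <artifactId>hadoop-aws</artifactId>
> >   <version>2.7.3</version>
> > </dependency>
> > <dependency>
> >   <groupId>com.amazonaws</groupId>
> >   <artifactId>aws-java-sdk</artifactId>
> >   <version>1.7.4</version>
> > </dependency>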
> >
> > However, many of the other commands do not seem to be working properly:
> >
> > hudi:tablename->savepoints show
> >
> > ╔═══════════════╗
> >
> > ║ SavepointTime ║
> >
> > ╠═══════════════╣
> >
> > ║ (empty)       ║
> >
> > ╚═══════════════╝
> >
> > hudi:tablename->savepoint create
> >
> > Commit null not found in Commits org.apache.hudi.common.table.timeline.HoodieDefaultTimeline: [20200724220817__commit__COMPLETED]
> >
> >
> > hudi:tablename->stats filesizes
> >
> > ╔════════════╤═══════╤═══════╤═══════╤═══════╤═══════╤═══════╤══════════╤════════╗
> > ║ CommitTime │ Min   │ 10th  │ 50th  │ avg   │ 95th  │ Max   │ NumFiles │ StdDev ║
> > ╠════════════╪═══════╪═══════╪═══════╪═══════╪═══════╪═══════╪══════════╪════════╣
> > ║ ALL        │ 0.0 B │ 0.0 B │ 0.0 B │ 0.0 B │ 0.0 B │ 0.0 B │ 0        │ 0.0 B  ║
> > ╚════════════╧═══════╧═══════╧═══════╧═══════╧═══════╧═══════╧══════════╧════════╝
> >
> >
> > hudi:tablename->show fsview all
> >
> > 171314 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3a://bucketName/path.parquet
> >
> > 171362 [Spring Shell] INFO  org.apache.hudi.common.fs.FSUtils  - Hadoop Configuration: fs.defaultFS: [file:///], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [org.apache.hadoop.fs.s3a.S3AFileSystem@6b725a01]
> >
> > 171666 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3a://bucketName/path.parquet/.hoodie/hoodie.properties
> >
> > 171725 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://bucketName/path.parquet
> >
> > 171725 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading Active commit timeline for s3a://bucketName/path.parquet
> >
> > 171817 [Spring Shell] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants [[20200724220817__clean__COMPLETED], [20200724220817__commit__COMPLETED]]
> >
> > 172262 [Spring Shell] INFO  org.apache.hudi.common.table.view.AbstractTableFileSystemView  - addFilesToView: NumFiles=0, NumFileGroups=0, FileGroupsCreationTime=5, StoreTimeTaken=2
> >
> > ╔═══════════╤════════╤══════════════╤═══════════╤════════════════╤═════════════════╤═══════════════════════╤═════════════╗
> > ║ Partition │ FileId │ Base-Instant │ Data-File │ Data-File Size │ Num Delta Files │ Total Delta File Size │ Delta Files ║
> > ╠═══════════╧════════╧══════════════╧═══════════╧════════════════╧═════════════════╧═══════════════════════╧═════════════╣
> > ║ (empty)                                                                                                                ║
> > ╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
> >
> > I looked through the CLI code, and it seems that for full support we would
> > need to add handling for the different storage options (HDFS/S3/Azure/etc.)
> > in HoodieTableMetaClient. From my understanding,
> > TableNotFoundException.checkTableValidity, one of the first steps in that
> > flow, checks just the HDFS filesystem.
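> >
> > For reference, my reading of that check is roughly the following
> > (paraphrased rather than verbatim, and I may be misreading how the
> > FileSystem instance gets resolved; it uses org.apache.hadoop.fs.FileSystem,
> > org.apache.hadoop.fs.Path, and java.io.IOException):
> >
> > // Paraphrased sketch of TableNotFoundException.checkTableValidity (not verbatim).
> > // It probes the base path and the .hoodie meta path through the FileSystem
> > // instance it is handed.
> > public static void checkTableValidity(FileSystem fs, Path basePathDir, Path metaPathDir) {
> >   try {
> >     if (!fs.exists(basePathDir) || !fs.isDirectory(basePathDir)) {
> >       throw new TableNotFoundException(basePathDir.toString());
> >     }
> >     if (!fs.exists(metaPathDir) || !fs.isDirectory(metaPathDir)) {
> >       throw new TableNotFoundException(metaPathDir.toString());
> >     }
> >   } catch (IOException e) {
> >     throw new TableNotFoundException(metaPathDir.toString());
> >   }
> > }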
> >
> > Could someone please clarify whether this is already supported and I'm just
> > not configuring it correctly, or whether it is something that would need to
> > be added? And is my reading of HoodieTableMetaClient on the right track?
> >
> > Thanks,
> >
>
