Re: Kudu on top of Alluxio
+1 thanks for adding that Todd.

Mike

On Mon, Mar 27, 2017 at 9:55 AM, Todd Lipcon wrote:

> On Sat, Mar 25, 2017 at 2:54 PM, Mike Percy wrote:
>
>> Kudu currently relies on local storage on a POSIX file system. Right now
>> there is no support for S3, which would be interesting but is non-trivial
>> in certain ways (particularly if we wanted to rely on S3's replication and
>> disable Kudu's app-level replication).
>>
>> I would suggest using only either EXT4 or XFS file systems for production
>> deployments as of Kudu 1.3, in a JBOD configuration, with one SSD per
>> machine for the WAL and with the data disks on either SATA or SSD drives
>> depending on the workload. Anything else is untested AFAIK.
>>
>
> I would amend this and say that SSD for the WAL is nice to have, but not a
> requirement. We do lots of testing on non-SSD test clusters and I'm aware
> of many production clusters which also do not have SSD.
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
Re: Kudu on top of Alluxio
On Sat, Mar 25, 2017 at 2:54 PM, Mike Percy wrote:

> Kudu currently relies on local storage on a POSIX file system. Right now
> there is no support for S3, which would be interesting but is non-trivial
> in certain ways (particularly if we wanted to rely on S3's replication and
> disable Kudu's app-level replication).
>
> I would suggest using only either EXT4 or XFS file systems for production
> deployments as of Kudu 1.3, in a JBOD configuration, with one SSD per
> machine for the WAL and with the data disks on either SATA or SSD drives
> depending on the workload. Anything else is untested AFAIK.
>

I would amend this and say that SSD for the WAL is nice to have, but not a
requirement. We do lots of testing on non-SSD test clusters and I'm aware of
many production clusters which also do not have SSD.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
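
For anyone who wants to check a machine against the EXT4/XFS suggestion quoted above, here is a minimal sketch, not anything shipped with Kudu. It assumes a Linux host and mount-table text in the `/proc/mounts` layout (device, mount point, file system type, ...). The helper names (`parse_mounts`, `fs_type_for`, `check_dirs`) and the directory paths are hypothetical. The directories you pass in would be whatever you plan to give Kudu as `--fs_wal_dir` and `--fs_data_dirs`.

```python
import posixpath

# File systems suggested for production in this thread (as of Kudu 1.3).
SUPPORTED_FS = {"ext4", "xfs"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point that contains path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, fs_type in mounts:
        mp = posixpath.normpath(mount_point)
        contains = path == mp or mp == "/" or path.startswith(mp + "/")
        # ">=" lets a later entry win when the same mount point appears twice.
        if contains and (best is None or len(mp) >= len(best[0])):
            best = (mp, fs_type)
    return best[1] if best else None


def check_dirs(dirs, mounts_text):
    """Map each directory to True if it sits on ext4 or xfs, else False."""
    mounts = parse_mounts(mounts_text)
    return {d: fs_type_for(d, mounts) in SUPPORTED_FS for d in dirs}
```

On a real host you would pass `open("/proc/mounts").read()` as `mounts_text`. The sketch ignores escaped characters in mount points (for example `\040` for a space) and says nothing about JBOD layout or whether a disk is an SSD.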
Re: Kudu on top of Alluxio
Yeah. I think HBase can use something like Alluxio or S3 fairly easily, while
Kudu can't, because HBase already relies on external storage (HDFS) for
replication. Substituting another storage system with similar properties
doesn't really amount to an architectural change for them.

Mike

> On Mar 25, 2017, at 3:43 PM, Benjamin Kim wrote:
>
> Mike,
>
> Thanks for the informative answer. I asked this question because I saw that
> Alluxio can be used to handle storage for HBase. Plus, we could keep our
> cluster size to a minimum and not need to add more nodes based on storage
> capacity. We would only need to size our clusters based on load (cores,
> memory, bandwidth) instead.
>
> Cheers,
> Ben
>
>
>> On Mar 25, 2017, at 2:54 PM, Mike Percy wrote:
>>
>> Kudu currently relies on local storage on a POSIX file system. Right now
>> there is no support for S3, which would be interesting but is non-trivial in
>> certain ways (particularly if we wanted to rely on S3's replication and
>> disable Kudu's app-level replication).
>>
>> I would suggest using only either EXT4 or XFS file systems for production
>> deployments as of Kudu 1.3, in a JBOD configuration, with one SSD per
>> machine for the WAL and with the data disks on either SATA or SSD drives
>> depending on the workload. Anything else is untested AFAIK.
>>
>> As for Alluxio, I haven't heard of people using it for permanent storage and
>> since Kudu has its own block cache I don't think it would really help with
>> caching. Also I don't recall Tachyon providing POSIX semantics.
>>
>> Mike
>>
>> Sent from my iPhone
>>
>>> On Mar 25, 2017, at 9:50 AM, Benjamin Kim wrote:
>>>
>>> Hi,
>>>
>>> Does anyone know of a way to use AWS S3 or
>>
>
Re: Kudu on top of Alluxio
Mike,

Thanks for the informative answer. I asked this question because I saw that
Alluxio can be used to handle storage for HBase. Plus, we could keep our
cluster size to a minimum and not need to add more nodes based on storage
capacity. We would only need to size our clusters based on load (cores,
memory, bandwidth) instead.

Cheers,
Ben

> On Mar 25, 2017, at 2:54 PM, Mike Percy wrote:
>
> Kudu currently relies on local storage on a POSIX file system. Right now
> there is no support for S3, which would be interesting but is non-trivial in
> certain ways (particularly if we wanted to rely on S3's replication and
> disable Kudu's app-level replication).
>
> I would suggest using only either EXT4 or XFS file systems for production
> deployments as of Kudu 1.3, in a JBOD configuration, with one SSD per machine
> for the WAL and with the data disks on either SATA or SSD drives depending on
> the workload. Anything else is untested AFAIK.
>
> As for Alluxio, I haven't heard of people using it for permanent storage and
> since Kudu has its own block cache I don't think it would really help with
> caching. Also I don't recall Tachyon providing POSIX semantics.
>
> Mike
>
> Sent from my iPhone
>
>> On Mar 25, 2017, at 9:50 AM, Benjamin Kim wrote:
>>
>> Hi,
>>
>> Does anyone know of a way to use AWS S3 or
>
Kudu on top of Alluxio
Hi,

Does anyone know of a way to use AWS S3 or