Jeff, thanks for your reply. Unfortunately, the website is under maintenance. The
reason I monitored the system calls of HDFS is to try to find out what
activities cause so much system CPU time. Other than writing to the disk and
sending and receiving packets, I cannot think of anything else that can
Hey Da,
You may have observed https://issues.apache.org/jira/browse/HDFS-1601.
Regards,
Jeff
On Fri, Jan 28, 2011 at 7:08 PM, Da Zheng wrote:
> Hello,
>
> I monitored system calls of HDFS with systemtap and found HDFS actually sends
> many 1-byte data to the network. I could also see many 8-
Thanks Aaron, it has to be click stream and the more the better.
Thanks everyone.
Bruce Williams
Concepts, like individuals, have their histories and are just as incapable
of withstanding the ravages of time as are individuals. But in and through
all this they retain a kind of homesickness for t
Start with the student's CS department's web server?
I believe the Wikimedia Foundation also makes the access logs to Wikipedia
et al. available publicly. That is quite a lot of data though.
- Aaron
On Sun, Jan 30, 2011 at 10:54 AM, Bruce Williams wrote:
> Does anyone know of a source of click s
Forgot to mention sheet music or tabs as another good source of sequence
data ;)
On Jan 30, 2011 2:45 PM, "brien colwell" wrote:
> You might consider starting with other sequence data like file bytes or DNA.
> The main difference between those and click stream is how you model the
> steps.
> On Ja
You might consider starting with other sequence data like file bytes or DNA.
The main difference between those and click stream is how you model the
steps.
On Jan 30, 2011 1:55 PM, "Bruce Williams" wrote:
Does anyone know of a source of click stream data for a student research
project?
Bruce Williams
Concepts, like individuals, have their histories and are just as incapable
of withstanding the ravages of time as are individuals. But in and through
all this they retain a kind of homesickness for th
This is the specific information I referred to in my post.
http://hadoop.apache.org/common/docs/r0.20.0/fair_scheduler.html
mapred.fairscheduler.loadmanager: An extensibility point that lets you
specify a class that determines how many maps and reduces can run on a given
TaskTracker. This class sh