Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.

2010-04-16 Thread m...@bruningsystems.com

Hi Fred,

Have you read the ZFS On Disk Format Specification paper?  It is at:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf




fred pam wrote:

Hi Richard, thanks for your time, I really appreciate it, but I'm still unclear 
on how this works.

So uberblocks point to the MOS. Why do you then require multiple uberblocks? Or are there actually multiple MOSes?
Or is there one MOS with multiple deltas to it (and its predecessors), and do the uberblocks then point to the latest delta?

In the latter case I can understand why nullifying the latest uberblocks reverts to a previous
situation; otherwise I don't see the difference between "nullifying the first uberblocks"
and "nullifying the last uberblocks".
  
One reason for multiple uberblocks is that uberblocks, like everything
else, are copy-on-write.  The reason you have 4 copies (2 labels at the
front and 2 labels at the end of every disk) is redundancy.  No, there are
not multiple MOSes in one pool (though there may be multiple copies of the
MOS via "ditto" blocks).  The current (or "active") uberblock is the one
with the highest transaction id and a valid checksum.  Transaction ids are
basically monotonically increasing, so nullifying the last uberblock can
revert you to a previous state.
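
The selection rule described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the Uberblock class,
its checksum_ok flag (standing in for real checksum verification) and the
active_uberblock helper are hypothetical names, not ZFS code, and the sketch
looks at nothing except the transaction id and checksum validity.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Uberblock:
    """Hypothetical stand-in for an on-disk uberblock."""
    txg: int            # transaction id
    checksum_ok: bool   # assumption: result of verifying the checksum


def active_uberblock(candidates: Iterable[Uberblock]) -> Optional[Uberblock]:
    """Return the valid uberblock with the highest transaction id, if any."""
    valid = [ub for ub in candidates if ub.checksum_ok]
    return max(valid, key=lambda ub: ub.txg, default=None)


# Three uberblocks written by successive transactions.
uberblocks = [Uberblock(100, True), Uberblock(101, True), Uberblock(102, True)]
print(active_uberblock(uberblocks).txg)

# "Nullifying" the newest one is modeled here as its checksum no longer
# validating, so the selection falls back to the previous transaction.
uberblocks[2] = Uberblock(102, False)
print(active_uberblock(uberblocks).txg)
```

In this model, invalidating the oldest uberblock instead would change
nothing, because it was never the one with the highest transaction id.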

max

Thanks, Fred
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Tuning the ARC towards LRU

2010-04-05 Thread m...@bruningsystems.com

Hi,

In simple terms, the ARC is divided into an MRU and an MFU side.
   target size (c) = target MRU size (p) + target MFU size (c-p)

On Solaris, to get from the MRU to the MFU side, the block must be
read at least once in 62.5 milliseconds.  For pure read-once workloads,
the data won't move to the MFU side and the ARC will behave exactly like an
(adaptable) MRU cache.


Richard,
I am looking at the code that moves a buffer from MRU to MFU,
and as I read it, if the block is read again and more than 62 milliseconds
have passed since the previous access, it moves from MRU to MFU
(lines ~2256 to ~2265 in arc.c).  Also, I have a program that reads the
same block once every 5 seconds, and on a relatively idle machine, I can
find the block in the MFU, not the MRU (using mdb).  If the block is read
again in less than 62 milliseconds, it stays in the MRU.
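
The behaviour I am describing can be modeled roughly as follows.  This is a
simplified sketch, not the code from arc.c: CachedBlock and access are
hypothetical names, time is passed in explicitly in milliseconds, and
leaving the timestamp untouched on a quick re-read is an assumption of the
model.

```python
from dataclasses import dataclass

ARC_MINTIME_MS = 62  # promotion threshold quoted in this thread


@dataclass
class CachedBlock:
    """Hypothetical model of a cached buffer."""
    last_access_ms: int
    state: str = "MRU"


def access(block: CachedBlock, now_ms: int) -> str:
    """Apply the MRU -> MFU rule described above; return the new state."""
    if block.state == "MRU":
        if now_ms - block.last_access_ms > ARC_MINTIME_MS:
            # Re-read after the window has passed: promote to MFU.
            block.state = "MFU"
            block.last_access_ms = now_ms
        # Assumption: a re-read inside the window changes nothing.
    else:
        block.last_access_ms = now_ms
    return block.state


slow_reader = CachedBlock(last_access_ms=0)
print(access(slow_reader, 5000))  # re-read after 5 seconds

fast_reader = CachedBlock(last_access_ms=0)
print(access(fast_reader, 30))    # re-read after 30 milliseconds
```

With these inputs the first block ends up in the MFU state and the second
stays in the MRU state, which matches what I see with mdb for the
5-second reader.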


max



Re: [zfs-discuss] ZFS hex dump diagrams?

2010-03-26 Thread m...@bruningsystems.com

Hi Richard,

Richard Elling wrote:

On Mar 25, 2010, at 2:45 PM, John Bonomi wrote:

  

I'm sorry if this is not the appropriate place to ask, but I'm a student and 
for an assignment I need to be able to show at the hex level how files and 
their attributes are stored and referenced in ZFS. Are there any resources 
available that will show me how this is done?



IMHO the best place to start with this level of analysis is the ZFS on-disk
specification doc:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf

It is getting long in the tooth and doesn't document recent features, but
it is fundamentally correct.
  

I completely agree with this, but good luck getting a hex dump from that
information.

max




Re: [zfs-discuss] ZFS hex dump diagrams?

2010-03-26 Thread m...@bruningsystems.com

Hi,

You might take a look at
http://www.osdevcon.org/2008/files/osdevcon2008-max.pdf
and
http://www.osdevcon.org/2008/files/osdevcon2008-proceedings.pdf, starting
at page 36.

Or you might just use "od -x file" for the file part of your assignment.

Have fun.
max


Eric D. Mudama wrote:

On Fri, Mar 26 at 11:10, Sanjeev wrote:

On Thu, Mar 25, 2010 at 02:45:12PM -0700, John Bonomi wrote:

I'm sorry if this is not the appropriate place to ask, but I'm a
student and for an assignment I need to be able to show at the hex
level how files and their attributes are stored and referenced in
ZFS. Are there any resources available that will show me how this
is done?


You could try zdb.


Or just look at the source code.





Re: [zfs-discuss] ZIL corrupt, not recoverable even with logfix

2009-12-10 Thread m...@bruningsystems.com

Hi James,
I just spent about a week recovering about 10TB of file data
for someone who encountered a (somewhat?) similar problem to what you
are seeing.

If you are still having problems with this, please contact me off-list.

Regards,
max

James Risner wrote:

It was created on AMD64 FreeBSD with 8.0RC2 (which was version 13 of ZFS iirc.)

At some point I knocked it out (export) somehow, I don't remember doing so 
intentionally.  So I can't do commands like zpool replace since there are no 
pools.

It says it was last used by the FreeBSD box, but the FreeBSD box does not show it with the
"zpool status" command.

I'm going down tomorrow to work on it again, and I'm going to try 8.0 Release 
AMD64 FreeBSD (I've already tried i386 AMD64 FreeBSD 8.0 Release) and 
Opensolaris dev-127.

I was just hoping there was some way I'm missing to mount it read only (I have tried 
"zpool import -f -o readonly=yes" but that doesn't work either.)
  




Re: [zfs-discuss] Zfs improvements to compression in Solaris 10?

2009-08-04 Thread m...@bruningsystems.com

Hi Bob,

Bob Friesenhahn wrote:

On Tue, 4 Aug 2009, Prabahar Jeyaram wrote:


You seem to be hitting :

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6586537

The fix is available in OpenSolaris build 115 and later, but not for
Solaris 10 yet.


It is interesting that this is a simple thread priority issue.  The 
system has a ton of available CPU but the higher priority compression 
thread seems to cause scheduling lockout.  The Perfmeter tool shows 
that compression is a very short-term spike in CPU. Of course since 
Perfmeter and other apps stop running, it might be missing some sample 
data.


I could put the X11 server into the real-time scheduling class but
hate to think about what would happen as soon as Firefox visits a web
site. :-)

I can understand wanting to be careful about running Xorg real time, but
what does this have to do with Firefox?  Firefox will still run in the
interactive class.

max



Re: [zfs-discuss] Best ways to contribute WAS: Fed up with ZFS causing data loss

2009-07-31 Thread m...@bruningsystems.com

Hi Ross,

Ross wrote:
#3 ZFS, unlike other things like the build system, is extremely well
documented.  There are books on it, code to read and even instructors
(Max Bruning) who can teach you about the internals.  My project even
organized a free online training for this



Again, brilliant if you're a programmer.

  
I think it is a misconception that a course about internals is meant
only for programmers.  An internals course should teach how the system
works.  If you are a programmer, this should help you to do programming
on the system.  If you are an admin, it should help you in your admin
work by giving you a better understanding of what the system is doing.
If you are a user, it should help you to make better use of the system.
In short, I think anyone who is working with Solaris/OpenSolaris can
benefit.

max



[zfs-discuss] raidz on-disk layout

2009-04-09 Thread m...@bruningsystems.com

Hi,

For anyone interested, I have blogged about raidz on-disk layout at:
http://mbruning.blogspot.com/2009/04/raidz-on-disk-format.html

Comments/corrections are welcome.

thanks,
max



Re: [zfs-discuss] Real Swap space & vmstat output

2009-03-07 Thread m...@bruningsystems.com

Hi,
iman habibi wrote:

Hi to all,

I defined 2G of swap space when installing Solaris 10 with a ZFS file
system, but when I run vmstat it shows that about 3G is free.  Why does
this happen?

My real memory is 4G.
Can anybody explain this?  I'm confused!


# vmstat 10
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s6 sd sd sd   in   sy   cs us sy id
 0 0 0 3875248 2385176 0  1  0  0  0  0  0 -0 -0 -0  2 1764 38933 10834 29 36 36
 1 0 0 3590808 2091536 1  2  0  0  0  0  0  0  0  0  0 2316 57731 14530 43 53 4
 0 0 0 3590824 2091616 0  0  0  0  0  0  0  0  0  0  2 2176 57499 14528 43 52 5
 0 0 0 3590816 2091608 0  0  0  0  0  0  0  0  0  0  0 2417 57756 14481 43 52 5
 1 0 0 3590808 2091600 0  0  0  0  0  0  0  0  0  0  0 2525 57154 14249 44 53 3
 0 0 0 3590808 2091600 0  0  0  0  0  0  0  0  0  0  2 2964 57376 14483 43 53 4

^C
# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool    68G  6.67G  61.3G     9%  ONLINE  -
# zfs list
NAME                            USED  AVAIL  REFER  MOUNTPOINT
rpool                          8.67G  58.3G    94K  /rpool
rpool/ROOT                     4.61G  58.3G    18K  legacy
rpool/ROOT/s10s_u6wos_07b      4.61G  58.3G  4.38G  /
rpool/ROOT/s10s_u6wos_07b/var   230M  58.3G   230M  /var
rpool/dump                     1.00G  58.3G  1.00G  -
rpool/export                   1.06G  58.3G    20K  /export
rpool/export/home              1.06G  58.3G  1.06G  /export/home
rpool/swap                        2G  60.3G    16K  -

A short answer is that the swap column from vmstat is showing the amount of
free space usable for anonymous memory.  This includes free space on disk
plus usable space in memory.  You should also take a look at "swap -lh" and
"swap -sh" output.
max
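
To put rough numbers on the short answer above, here is a small Python
calculation using figures from the quoted output.  It is only a sketch:
vmstat reports the swap column in kilobytes, and the assumption that the
2G rpool/swap volume is the only swap device is taken from the zfs list
output rather than from "swap -l".

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib: int) -> float:
    """Convert a kilobyte count, as printed by vmstat, to gigabytes."""
    return kib / KIB_PER_GIB


vmstat_swap_kib = 3590808   # "swap" column from the steady-state rows
swap_device_gib = 2.0       # assumption: rpool/swap is the only swap device

available_gib = kib_to_gib(vmstat_swap_kib)

# Even if the whole swap device were free, it could account for at most
# 2G of the figure; the remainder has to be usable space in memory.
from_memory_at_least_gib = available_gib - swap_device_gib

print(f"available for anonymous memory: {available_gib:.2f}G")
print(f"of which from memory, at least: {from_memory_at_least_gib:.2f}G")
```

So the roughly 3.4G in the swap column is larger than the 2G swap volume
because it also counts usable memory, not because the swap device is
bigger than configured.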



Regards

