Re: Documentation on device-mapper and friends
You should take this to the device-mapper list, but I'll try here. For
lurkers, this drawing may be helpful:
http://www.thomas-krenn.com/en/oss/linux-io-stack-diagram/linux-io-stack-diagram_v1.0.pdf

On Wed, May 8, 2013 at 11:15 AM, neha naik wrote:
> Hi Greg,
>    Thanks for the information. I have another question :).
>    Is there less flexibility if we use a device mapper target? For
> example, with a block device driver you can use the API such that it
> won't use the OS I/O scheduler, so the I/O comes directly to the
> block device driver through the 'make_request' call. With the device
> mapper I don't think that happens (looking at the API calls).

I believe that is correct, and it is one of the reasons DRBD is not
part of DM. DRBD has various modes, including ones where it guarantees
that the I/Os on both the main target and the replicated target land in
exactly the same order. I don't believe DM can be used to totally
control disk I/O; it is meant to be stackable, so I think you lose some
exact control.

> Does this mean that stuff like I/O scheduling, barrier control, etc.
> is done by the device mapper itself and we can focus only on
> 'mapping' the I/O?

As shown in the diagram linked above, DM sits above the I/O schedulers,
so you should not have to worry about them. If you want to play with
schedulers, I think that should be done outside of DM. Thus I believe
you can ignore barriers/scheduling UNLESS you create a target that
needs special barrier/scheduling control.

The obvious example of something needing that is raid5. If you get a
barrier that forces data out to a single disk in a raid, you MUST
ensure that the raid checksum is calculated and written out prior to
calling the barrier complete. That is going to take special handling no
matter what you do.
It's been a couple of years since I dug into raid5/6 as it relates to
barriers, but it used to be that the code simply didn't do the right
thing in mdraid, and DM did not support raid5/6. So yes, the coders
could ignore it, but they created broken logic when they did.

Greg

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
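To make "focus only on 'mapping' the I/O" concrete: a passthrough-style
DM target only has to rewrite each bio's target device and sector and
hand it back to DM. Below is a non-buildable sketch against the 3.x-era
API; the mytgt_* names and the mytgt_ctx struct are hypothetical, and
the constructor/destructor (argument parsing, dm_get_device()) are
omitted. See drivers/md/dm-linear.c for the real thing.

```c
#include <linux/device-mapper.h>
#include <linux/module.h>

/* hypothetical per-target state, filled in by the (omitted) constructor */
struct mytgt_ctx {
	struct dm_dev *dev;	/* underlying device */
	sector_t start;		/* offset on that device */
};

/* constructor/destructor omitted: parse argv, dm_get_device(), etc. */
static int mytgt_ctr(struct dm_target *ti, unsigned int argc, char **argv);
static void mytgt_dtr(struct dm_target *ti);

/* DM calls this once per bio; the target only remaps, DM does the rest */
static int mytgt_map(struct dm_target *ti, struct bio *bio)
{
	struct mytgt_ctx *ctx = ti->private;

	bio->bi_bdev = ctx->dev->bdev;
	bio->bi_sector = ctx->start + dm_target_offset(ti, bio->bi_sector);
	return DM_MAPIO_REMAPPED;	/* tell DM to resubmit the bio */
}

static struct target_type mytgt_target = {
	.name	 = "mytgt",
	.version = {1, 0, 0},
	.module	 = THIS_MODULE,
	.ctr	 = mytgt_ctr,
	.dtr	 = mytgt_dtr,
	.map	 = mytgt_map,
};
/* registered from module init with dm_register_target(&mytgt_target) */
```

Note there is no make_request or scheduler hook anywhere in sight,
which is exactly the trade-off discussed above.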
Couple of questions on OOM trace.
Here is the OOM trace header:

  Out of memory: Kill process 5374 (min_free_kbytes) score 944 or sacrifice child
  Killed process 5374 (min_free_kbytes) total-vm:30495360kB, anon-rss:20155328kB, file-rss:64kB
  min_free_kbytes invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
  min_free_kbytes cpuset=/ mems_allowed=0

I understand that in this case free pages have gone below min_pages [1].
A couple of questions based on these four lines:

(1) I googled a lot, but I am not able to find the meaning of
    'score 944', anon-rss and file-rss.
(2) In my understanding gfp_mask is relevant here to know from which
    zone the memory allocation was tried but failed, right?
(3) What do order and oom_score_adj signify here?

Regards,
Shraddha

[1] http://dd.qc.ca/people.redhat.com/kernel/min_free_kbytes.html
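For what it's worth: as I understand it, the score is the OOM killer's
"badness" value, roughly thousandths of the memory the process is
allowed to use (so 944 is about 94%); anon-rss is anonymous resident
memory (heap/stack) and file-rss is file-backed resident memory;
order=0 means a single-page (2^0 pages) allocation failed; and
oom_score_adj is the per-process userspace adjustment (range -1000 to
1000) added to the badness score. Both values can be inspected live:

```shell
# The kernel's current badness score for this shell
# (higher = killed first):
cat /proc/self/oom_score

# The userspace adjustment added to it; -1000 makes a process
# effectively unkillable, 0 is the default:
cat /proc/self/oom_score_adj
```

These files exist on any kernel new enough to print oom_score_adj in
the trace, as yours does.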
Re: Documentation on device-mapper and friends
Hi Greg,
   Thanks for the information. I have another question :).
   Is there less flexibility if we use a device mapper target? For
example, with a block device driver you can use the API such that it
won't use the OS I/O scheduler, so the I/O comes directly to the block
device driver through the 'make_request' call. With the device mapper I
don't think that happens (looking at the API calls). Does this mean
that stuff like I/O scheduling, barrier control, etc. is done by the
device mapper itself and we can focus only on 'mapping' the I/O?

Regards,
Neha

On Wed, May 8, 2013 at 8:11 AM, Greg Freemyer wrote:
> The block layers can be layered both ways. DM is the newer
> infrastructure and was created in the early days of 2.6.
>
> If what I was writing could fit into a dm-target, that is what I
> would do.
>
> There are significant projects like drbd and mdraid that are not
> dm-targets, but I think there is a long-term goal to incorporate
> mdraid's functionality at a minimum into dm. I doubt drbd is ever
> moved to dm. It is just too big of a project and in use in lots of
> production server environments.
>
> Greg
>
> On Tue, May 7, 2013 at 1:46 AM, Gaurav Mahajan wrote:
> > Hi Neha,
> >
> > LVM uses device mapper. An advantage of using device mapper is that
> > you can stack different dm-targets on each other.
> > I am really not aware of block device drivers.
> >
> > Maybe Greg can help us understand the actual pros and cons.
> >
> > Thanks,
> > Gaurav
> >
> > On Wed, May 1, 2013 at 9:45 PM, neha naik wrote:
> >> Hi Gaurav,
> >> I went through your blog and it is really informative. But after
> >> reading that I realized that I have a question:
> >> If I want to write a block device driver which is going to sit on
> >> lvm (and do some functionality on top of it), then should I go for
> >> the block device driver API or write it as a device mapper target?
> >> What are the advantages/disadvantages of both approaches.
> >> Regards,
> >> Neha
> >>
> >> On Tue, Apr 30, 2013 at 4:24 AM, Gaurav Mahajan wrote:
> >>> Hi Amit,
> >>>
> >>> I had compiled some notes on my blog.
> >>> Here are some links on writing your own device mapper target:
> >>> http://techgmm.blogspot.in/p/writing-your-own-device-mapper-target.html
> >>>
> >>> Concept of device mapper target:
> >>> http://techgmm.blogspot.in/p/device-mapper-layer-explored-every.html
> >>>
> >>> Thanks,
> >>> Gaurav.
> >>>
> >>> On Tue, Apr 30, 2013 at 5:05 AM, Anatol Pomozov wrote:
> >>>> Hi
> >>>>
> >>>> On Mon, Apr 29, 2013 at 9:51 AM, amit mehta wrote:
> >>>> > On Sun, Apr 28, 2013 at 5:24 PM, Greg Freemyer wrote:
> >>>> >> A nice diagram of the overall storage subsystem is at
> >>>> >> http://www.thomas-krenn.com/en/oss/linux-io-stack-diagram.html
> >>>> >>
> >>>> >> Dm is just a single block in it, but it can help to see where
> >>>> >> it fits in overall.
> >>>> >>
> >>>> >> Btw: that diagram doesn't show the legacy ata driver that
> >>>> >> creates /dev/hdx style devices. Has that been dropped while I
> >>>> >> wasn't paying attention? I haven't used it in years, but I
> >>>> >> thought it was still used on embedded systems.
> >>>> >
> >>>> > Thank you for sharing the link, but I'm looking for more
> >>>> > detailed information on the I/O stack in Linux, dm-mapper and
> >>>> > multipath in particular.
> >>>>
> >>>> Some docs about multipath can be found here:
> >>>> http://www.sourceware.org/lvm2/wiki/MultipathUsageGuide
> >>>> http://christophe.varoqui.free.fr/refbook.html
> >>>>
> >>>> The userspace part for tools is here:
> >>>> http://sourceware.org/lvm2/
Re: Documentation on device-mapper and friends
The block layers can be layered both ways. DM is the newer
infrastructure and was created in the early days of 2.6.

If what I was writing could fit into a dm-target, that is what I would
do.

There are significant projects like drbd and mdraid that are not
dm-targets, but I think there is a long-term goal to incorporate
mdraid's functionality at a minimum into dm. I doubt drbd is ever moved
to dm. It is just too big of a project and in use in lots of production
server environments.

Greg

On Tue, May 7, 2013 at 1:46 AM, Gaurav Mahajan wrote:
> Hi Neha,
>
> LVM uses device mapper. An advantage of using device mapper is that
> you can stack different dm-targets on each other.
> I am really not aware of block device drivers.
>
> Maybe Greg can help us understand the actual pros and cons.
>
> Thanks,
> Gaurav
>
> On Wed, May 1, 2013 at 9:45 PM, neha naik wrote:
>> Hi Gaurav,
>> I went through your blog and it is really informative. But after
>> reading that I realized that I have a question:
>> If I want to write a block device driver which is going to sit on
>> lvm (and do some functionality on top of it), then should I go for
>> the block device driver API or write it as a device mapper target?
>> What are the advantages/disadvantages of both approaches.
>>
>> Regards,
>> Neha
>>
>> On Tue, Apr 30, 2013 at 4:24 AM, Gaurav Mahajan wrote:
>>> Hi Amit,
>>>
>>> I had compiled some notes on my blog.
>>> Here are some links on writing your own device mapper target:
>>> http://techgmm.blogspot.in/p/writing-your-own-device-mapper-target.html
>>>
>>> Concept of device mapper target:
>>> http://techgmm.blogspot.in/p/device-mapper-layer-explored-every.html
>>>
>>> Thanks,
>>> Gaurav.
>>> On Tue, Apr 30, 2013 at 5:05 AM, Anatol Pomozov wrote:
>>>> Hi
>>>>
>>>> On Mon, Apr 29, 2013 at 9:51 AM, amit mehta wrote:
>>>> > On Sun, Apr 28, 2013 at 5:24 PM, Greg Freemyer wrote:
>>>> >> A nice diagram of the overall storage subsystem is at
>>>> >> http://www.thomas-krenn.com/en/oss/linux-io-stack-diagram.html
>>>> >>
>>>> >> Dm is just a single block in it, but it can help to see where
>>>> >> it fits in overall.
>>>> >>
>>>> >> Btw: that diagram doesn't show the legacy ata driver that
>>>> >> creates /dev/hdx style devices. Has that been dropped while I
>>>> >> wasn't paying attention? I haven't used it in years, but I
>>>> >> thought it was still used on embedded systems.
>>>> >
>>>> > Thank you for sharing the link, but I'm looking for more
>>>> > detailed information on the I/O stack in Linux, dm-mapper and
>>>> > multipath in particular.
>>>>
>>>> Some docs about multipath can be found here:
>>>> http://www.sourceware.org/lvm2/wiki/MultipathUsageGuide
>>>> http://christophe.varoqui.free.fr/refbook.html
>>>>
>>>> The userspace part for tools is here:
>>>> http://sourceware.org/lvm2/
Re: current tty
On Wed, May 8, 2013 at 6:21 AM, Hatte John wrote:
> Hi:
>
> As I know, /dev/tty is the current task's tty, which is stored to
> current->tty.
> My question is: when is this value assigned to current->tty?

As I told you on IRC, struct task_struct has no ->tty member.
If your custom/old/whatever kernel has one, use grep to find out...

Thanks,
//richard
unable to set smp_affinity of PCIe interrupt
Hello,

Following is the output of cat /proc/interrupts:

         CPU0   CPU1   CPU2
  200:   9634      0      0   PCIe0-MSI   eth0

I want to change the processor affinity of the above PCIe interrupt.
By executing the command

  echo 2 > /proc/irq/200/smp_affinity

I get the following error:

  sh: write error: Input/output error

Even after changing the file permissions I have the same problem.
I would like to know how to change the affinity of the above
interrupt.

Regards,
Amit.
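A note on the error above: as far as I know, that EIO is usually not a
permissions problem. The proc write is rejected with Input/output error
when the interrupt chip driver provides no way to retarget that IRQ
(irq_can_set_affinity() fails), so no amount of chmod will help. The
commands below show the mechanics; IRQ number 200 is taken from the
post, and the writes are guarded so they merely report failure:

```shell
irq=200   # IRQ number from the /proc/interrupts output above

# Current affinity as a hex CPU bitmask (bit N = CPU N):
cat /proc/irq/$irq/smp_affinity 2>/dev/null || echo "no IRQ $irq here"

# Try to restrict it to CPU1 (bit 1 => mask 2); needs root, and still
# fails with EIO if the irq chip cannot retarget this interrupt:
echo 2 2>/dev/null > /proc/irq/$irq/smp_affinity \
    || echo "affinity change rejected by the kernel"

# The system-wide default mask applied to newly requested IRQs:
cat /proc/irq/default_smp_affinity
```

If the write is rejected like this, the fix is in the interrupt
controller driver (or the hardware simply cannot do it), not in
userspace.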
Re: current tty
That is not the answer I wanted; I am asking in terms of code.

2013/5/8 Jun Hu:
> you can get it by the "tty" command, like the following:
>
> junwork:/proc/self/task/1874 # tty
> /dev/pts/0
>
> -----
> Jun Hu
> DSE in SUSE China.
> -----
>
> To: kernelnewbies@kernelnewbies.org
> Subject: current tty
> Date: Wed, 8 May 2013 12:21:32 +0800
>
> Hi:
>
> As I know, /dev/tty is the current task's tty, which is stored to
> current->tty.
> My question is: when is this value assigned to current->tty?
>
> Thanks!
Analyzing Kernel call traces.
Any good tutorial for analyzing kernel call traces? I want to know the
meaning of everything that appears in the call trace and get to the
exact cause of the problem.

Thanks for the noble cause of sharing knowledge.

Regards,
Shraddha
RE: read PCI memory and config space through /dev/mem
Jun Hu wrote:
> strace your application, like as:
> ...

It's the lseek that fails: obviously, /dev/mem is not seekable here.
What really puzzles me is that dd _does_ work on /dev/mem when looking
at ordinary RAM, but I'm reluctant to debug that further.

But anyhow, as I think it may at times be useful for others to have a
simple tool that allows one to read (and write!) PCI memory, I share
the few lines of code that I wrote to fulfill my needs. I hope that the
included "documentation" is verbose enough. And I'm certainly
interested if anyone finds bugs :-):

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

void usage(const char *name)
{
    fprintf(stderr, "Usage: %s address access [size]\033[2m\n", name);
    fprintf(stderr, "  Either - prints \"size\" bytes \033[1mfrom\033[2m physical memory,\n");
    fprintf(stderr, "           starting at \"address\" and using \"access\"\n");
    fprintf(stderr, "           bytes per access \033[1mto\033[2m stdout\n");
    fprintf(stderr, "  or     - writes as many bytes as are available \033[1mfrom\033[2m\n");
    fprintf(stderr, "           stdin \033[1mto\033[2m physical memory starting at \"address\"\n");
    fprintf(stderr, "           using \"access\" bytes per access.\n");
    fprintf(stderr, "  Note that both \"address\" and either \"size\" or the\n");
    fprintf(stderr, "  number of available bytes from stdin must be a multiple of\n");
    fprintf(stderr, "  \"access\", and \"access\" must either be %zu, %zu, %zu or %zu.\033[0m\n",
            sizeof(uint8_t), sizeof(uint16_t), sizeof(uint32_t), sizeof(uint64_t));
    exit(-1);
}

template <typename T> void readOrWrite(uint8_t *src, uint8_t *mem)
{
    if (src)                    // write mode: copy from the stdin buffer
        *(T *) mem = *(T *) src;
    else {                      // read mode: copy out to stdout
        T t = *(T *) mem;
        assert(write(1, &t, sizeof(T)) == sizeof(T));
    }
}

int main(int argc, char *argv[])
{
    int fd, pageSize = getpagesize(), size = 0, n;
    uint8_t *map, *mem, *b = 0, *buffer = 0;

    // Paranoia check to ensure that the page size is a power of 2.
    assert((pageSize != 0) && !(pageSize & (pageSize - 1)));

    if (argc == 3)              // write mode: slurp all of stdin
        do {
#define CHUNK_SIZE 1000
            assert(b = buffer = (uint8_t *) realloc(buffer, size + CHUNK_SIZE));
            assert((n = read(0, buffer + size, CHUNK_SIZE)) >= 0);
            size += n;
        } while (n);
    else if (argc == 4)         // read mode: size given on the command line
        size = strtoul(argv[3], 0, 0);
    else
        usage(argv[0]);

    off_t address = strtoul(argv[1], 0, 0);
    size_t access = strtoul(argv[2], 0, 0);
    if (address % access || size % access)
        usage(argv[0]);

    assert((fd = open("/dev/mem", O_RDWR)) >= 0);

    // Map whole pages covering [address, address + size).
    size_t s = (((address & (pageSize - 1)) + size - 1) / pageSize + 1) * pageSize;
    map = (uint8_t *) mmap(NULL, s, PROT_READ | PROT_WRITE, MAP_SHARED,
                           fd, address & ~(off_t) (pageSize - 1));
    assert(map != MAP_FAILED);
    mem = map + (address & (pageSize - 1));  // skip the in-page offset

    for (uint8_t *i = mem; i < mem + size; i += access) {
#define ROW(t) sizeof(t): readOrWrite<t>(b, i); break
        switch (access) {
            case ROW(uint8_t);
            case ROW(uint16_t);
            case ROW(uint32_t);
            case ROW(uint64_t);
            default: usage(argv[0]);
        }
        if (b)
            b += access;
    }

    free(buffer);
    munmap(map, s);
    close(fd);
    return 0;
}