I've been testing a 3 node cluster (Side A, Side B and a Tiebreaker node) for resilience. When I take either side out (shared SAN disk access on 2 sides) the filesystem on the other node hangs for about 90 seconds before letting IO resume. While it is hanging, I am getting the following messages fro ...
ENV: 1-node GPFS cluster with Oracle 10g RAC, AIX 5.3 TL8, GPFS 3.1, p595, 32 CPU, 112GB, DS8300. The following are some GPFS parameters: pagepool 7G, maxMBpS 2048, maxFilesToCache 100000, prefetchThreads 100, worker1Threads 449, sharedMemLimit 1536M, maxStatCache 400000. The above customer recently has a ...
The following is output of mmfsadm dump thread: === root@bgcgcdb1 /> mmfsadm dump thread | pg Dump of thread system: TH at 0x110000C70 Running thread 0x113206CB0 (mmfsadm dump command) nThreads 648 sequenceNum 434264965 Thread pool utilization (current/highest/maximum): ThPoolNonCritical: 14/32/32 ThPool ...
I would like to run "mmfsadm dump waiters" from a script on a regular basis for monitoring GPFS performance. Is it safe to run "mmfsadm dump waiters" on a live GPFS file system every five minutes? Would "mmfsadm saferdump waiters" be a better choice? Why? I can't find an explanation of the difference ...
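For the monitoring idea in the post above, a minimal Python sketch of how a script might filter captured waiter output. It does not answer whether running the command on a live file system is safe. It assumes one waiter per line in the form quoted elsewhere in this listing ("0x... waiting N seconds, ..."); the function names and the 10-second threshold are hypothetical, and the text is expected to have been captured by some other means.

```python
import re

# Assumed line format, taken from waiter lines quoted in this listing, e.g.
# "0x8438168 waiting 323.422853000 seconds, SharedHashTab fetch handler: ..."
WAITER_RE = re.compile(
    r"(?P<addr>0x[0-9A-Fa-f]+) waiting (?P<secs>\d+(?:\.\d+)?) seconds, (?P<rest>.*)"
)


def parse_waiters(text):
    """Return (address, seconds, description) tuples for each waiter line."""
    waiters = []
    for line in text.splitlines():
        match = WAITER_RE.search(line)
        if match:
            waiters.append(
                (match.group("addr"), float(match.group("secs")), match.group("rest").strip())
            )
    return waiters


def long_waiters(text, threshold_secs=10.0):
    """Keep only waiters at or above a (hypothetical) threshold in seconds."""
    return [w for w in parse_waiters(text) if w[1] >= threshold_secs]
```

A scheduled job could pass each captured dump to `long_waiters` and log or alert only when the returned list is non-empty.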
Hi, I launched mmshutdown to restart the daemon on a server. The shutdown process hung with this message: Mon May 5 16:36:19.730 2008: GPFS: 6027-338 Waiting for 1 user(s) of shared segment to release it. I logged out all potential users with open files and shut down the daemon on all the other servers. But I ...
Gurus, I am having a problem with a package that I created. When I execute it I get this error: java.sql.BatchUpdateException: ORA-06550: line 1, column 7: PLS-00201: identifier 'SP_INSERT_FS_DUMP' must be declared ORA-06550: line 1, column 7: PL/SQL: Statement ignored. I am ...
Dear, I have the following test setup: GPFS test FS that consists of two LUNs: Lun X - Storage box site 1 - NSD gpfsnsd1; Lun Y - Storage box site 2 - NSD gpfsnsd2; 1 common SAN environment. There are 3 sites: Site 1, which contains Quorum-Manager Nodes A / B; Site 2, which contains Quorum-Manager Nodes C / D ...
Hello all, we are currently evaluating SOFS with the underlying GPFS 3.2.1.6 filesystem. We ran into trouble when defining new disks for the GPFS filesystem. The config that we had was one filesystem with 20 NSDs split over two locations in one failure group. Now we added new dis ...
We have had a GPFS cluster running fine for the past year. We have a new problem now. The FS is very slow and we saw that there were waiters waiting for as long as 22 seconds. Please see the messages below: GPFS 3.2.1-8 (upgraded to 3.2.1-8 about 3 weeks ago), RHEL 5.2 x86_64. mmfsadm dump waiter report ...
Looking to improve GPFS I/O, I have seen the following on the GPFS nodes: root@gcsic099wn ~# mmfsadm dump thread Dump of thread system: TH at 0x406D32C0 Running thread 0x2B050C015020 (mmfsadm dump command) nThreads 139 sequenceNum 12587394 Thread pool utilization (current/highest/maximum): ThPoolNo ...
A few days ago we started to have a recurrent problem on a GPFS node. It belongs to a slave cluster and mounts a couple of file systems from two master clusters. GPFS version is 3.2.1-15. Sometimes the GPFS file systems on that node become unresponsive, and we find a load of 300 on the machine. The cl ...
I'm testing GPFS for storing very many small files (> 100,000). I set PagePool=4096M, maxFilesToCache=100000 (maximum) and maxStatCache 400000. The total cache size was 2 GB during the stress test (values taken from mmfsadm dump | grep cache | less). But I have servers with 16 GB of RAM and I want to use all thi ...
Recently we experienced a file system lock-up with a lot of quota-related waiters on the FS manager node. Rebooting the FS manager released the lock-up for only a few seconds before waiters started to accumulate again on the new FS manager. This problem seems to occur after we did the following ch ...
Test Scenario 1. Environment: AIX 2 Nodes + GPFS 3.4.0.7 (latest) 2. Create NSDs as below: gpfs1nsd:::dataAndMetadata:100::system gpfs2nsd:::dataAndMetadata:100::system gpfs5nsd:::descOnly:300::system gpfs3nsd:::dataAndMetadata:200::system gpfs4nsd:::dataAndMetadata:200::system 3. Create 1 FS with replica enabl ...
On Linux RH ES3 and GPFS 2.2.1.5 the nfsd's on all NFS servers go into DW state. One or more machines return the following: mmfsadm dump waiters 0x8438168 waiting 323.422853000 seconds, SharedHashTab fetch handler: on ThCond 0xF8E77B0C (0xF8E77B0C) (LkObj), reason 'change_lock_shark waiting to set acq ...
After applying several updates I've noticed that "maxFeatureLevelAllowed", as I see it from "mmlsconfig", doesn't correspond to the current version of the software (reported by "mmfsadm dump config"): mmfsadm dump config ... my comm version 915 compatible version 901 max compatible version 99999 max featu ...
Hello, GPFS build branch "3.2.1.2". I am sometimes noticing a long delay (1-2 minutes) before getting ls output. This happens on all GPFS clients. After the first directory read, subsequent reads are fast on the same client (I suspect because of caching), but the same problem happens on the other (idle) cli ...
We recently had a disk controller failure on an NSD server serving one of the disks on our filesystem (Linux, GPFS v. 3.2.1-3). The computer was unresponsive and hosed, so I powered it off. Then I suspended the disk in the filesystem and ran 'mmrestripefs $fsname -r' and that completed just fine. Nex ...
We have a host that we are trying to add as a client, but it is in a continuous arbitrating state. We have been able to add it before, but it looks like something broke and we could not repeat it. We tried deleting it and re-adding it, but that did not help. Node number Node name Quorum Nodes ...
Currently we have a customer running GPFS 2.3 on VSD (32 nodes, average 60+ users per filesystem) on AIX 5.2. On a particular filesystem, the customer noticed that some of the files are truncated, i.e. the file size is big but they can only read part of the file. When they open the file in vi and save it, the size ...
1. How can we know how much pagepool we are using (the utilization) now? (mmfsadm dump <something>?) 2. We set pagepool size to 10GB, but there is a message in mmfs.log: Thu Oct 8 09:17:37.391 2009: Pagepool has size 7864320K bytes instead of the requested 10485760K bytes. From GPFS 3.2.1 ...
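As a quick check of the two sizes in the log message above, a small Python sketch that converts them to GiB. It assumes "K" in the message means 1024 bytes; the helper name is hypothetical.

```python
def kib_to_gib(kib):
    """Convert a size in KiB to GiB (assumes 1K = 1024 bytes)."""
    return kib / (1024 * 1024)


granted_kib = 7864320     # "Pagepool has size 7864320K bytes"
requested_kib = 10485760  # "requested 10485760K bytes"

granted_gib = kib_to_gib(granted_kib)      # 7.5
requested_gib = kib_to_gib(requested_kib)  # 10.0
fraction = granted_kib / requested_kib     # 0.75
```

Under that assumption the request matches the configured 10GB exactly, and the daemon allocated 7.5 GiB, three quarters of it.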
I have been looking for quite a long time for the reason why one of my nodes does not mount the filesystem. Now I had to reboot my second node and it is not mounting either. mmlscluster *GPFS cluster information* ======================== GPFS cluster name: GPFS_01.proxsys-net.intern GPFS cluster id: 72340254545 ...
Hello all, today we ran some tests on our SOFS cluster and, after powering down one node (SOF00003), we saw a deadman switch timer event on node SOM00001. This node is the primary configuration server in our cluster. The message was "SOM00001 kernel: 772180.917028 GPFS Deadman Switch timer [0] h ...
On a filesystem mounted remotely on a different cluster (on the same LAN) we have twice observed the following problem: mmfsadm dump waiters showed long (thousands of seconds) waiters on the cluster owning the filesystem. The problem was traced down to a node which was waiting for a pending mess ...
We are occasionally seeing a hung filesystem for several minutes when attempting to take a snapshot. In some cases, the delay is over 10 minutes and we end up restarting GPFS on all nodes to recover. This is GPFS 3.2.1.19 with kernel 2.6.18-164.6.1. I've attached the output of "mmfsadm dump all" dur ...
It is a two-node AIX GPFS cluster. GPFS version: 3.3.0.6, AIX version: 5300-09-01-0847. When the domain application starts running, GPFS has heavy I/O waiters, but the NSD disk is not very busy, and the CPU shows about 50% wait. I have tried to tune some I/O-relevant parameters, but the problem still exists. ...
I am running GPFS 3.3.0.7 on a Redhat Linux 5.3 cluster. I've enabled SNMP as described in the IBM Admin. Guide and verified it connects correctly by examining the log file. The problem I am having is that snmp requests sent to the GPFS node acting as the SNMP agent will quite frequently timeout. Oc ...
We have a restripe that appears to be stuck, with extremely long waiters. A disk was suspended and the restripe was being done prior to the mmdeldisk (though I'm sure the restripe wasn't necessary before the deldisk). Otherwise, this restripe does not appear to have been anything other than a typica ...
It is said that with a 64-bit kernel, the number of prefetchThreads plus worker1Threads must be <= 550. For prefetchThreads, we can find the current number and the maximum it has ever reached (high water mark) from the mmfsadm dump threads command. Is it possible to find the current number and the maximum ...
I wrote a script to generate files with the integer containing character length as a prefix.-rwx---r-x 1 root root 0 Nov 8 18:31 084กกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกกก ...
It is said that the maximum value of prefetchThreads and worker1Threads must be <=550. 1. Is this still valid for GPFS 3.3/3.4? 2. It looks like a customer of mine who is using a p595 (32 CPUs) has reached this limit; is there anything that we can do? === root@bgcgcdb1 /oracle> sh chk_gpfs_pt ...
For a couple of days we have been seeing lots of waiters (not especially under heavy load), and the nodes in trouble always look like this: gcsicxxwn: 0x416F7FB0 waiting yyyy.yyyy seconds, SharedHashTab fetch handler: on ThCond 0x416EFCC0 (0x416EFCC0) (MsgRecord), reason 'RPC wait' for tmMsgRevoke ...
We consistently have a large number of waiters and are considering bumping these to the max of 550. What side effects could this produce? Or what would be a good value to start with, and based on what? root@linuscs103 scripts# mmfsadm dump mb | grep Worker1Threads Worker1Threads: max 48 current limit 48 in use 48 wait ...
I have experienced a deadlock on only one of two running filesystems on GPFS servers. ls, mmdf and so on did not respond. As far as I investigated, there were no network or disk problems. In particular, on the pgfs07 NSD node, the CPU was overloaded (70~80%) by mmfsd, and there was no free memory. So, I attached the p ...
Tried searching the forum but could not find any posts with the same error message. I am trying to run an mmdeldisk on a group of disks. Whether I try and delete them all (with -F option) or just delete a single disk, the mmdeldisk does not finish successfully. It manages to get all of the data off ...
I need help getting the UI to work with my file system. import java.awt.*; import javax.swing.*; import javax.swing.JLabel; import javax.swing.JTextArea; import javax.swing.JTextField; import java.awt.event.ActionEvent; import java.awt.event.*; class UI extends JFrame implements ActionListener { private JButton f ...
I am trying to rip a VCD using vcdxrip. Below is what I did: devkit-disks --mount /dev/scd0 vcdxrip -i /dev/scd0 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ++ WARN: XA signature not found in ISO9660's system use a ...
In 3.2 testing, I observed that the open file counts from dump fs have decreased as below: Tue Jun 14 19:05:00 KORST 2011 cached 517961, currently open 6+3, cache limit 520000 (min 10, max 520000), eff limit 520000 Tue Jun 14 19:15:00 KORST 2011 cached 518014, currently open 12+3, cache limit 520000 (min 10, ...
I am experiencing recurrent hangs on a node of a slave GPFS cluster. The 'mmfsadm dump waiters' output reports this: 0x86975E0 waiting 0.477309000 seconds, SharedHashTab fetch handler: delaying for 0.522687000 more seconds, reason: tcAwaitRecovery pause 1 sec and several of these: 0x869E848 waiting 0.487770700 ...
In one of our filesystems we are seeing on a specific NSD server a long list of waiters of this kind: mmfsadm dump waiters 0x2AAAAC113B10 waiting 4.753531000 seconds, NSDThread: for I/O completion on disk dm-150 0x2AAAAC112840 waiting 1.529843000 seconds, NSDThread: for I/O completion on disk dm-150 0 ...
From time to time we are experiencing complete filesystem hangs. The GPFS version is 3.4.0-7, kernel is 2.6.18-194.17.1.el5. What follows is an analysis of the situation. ##################################################### DUMP WAITERS: mmdsh -v -N all `which mmfsadm` dump waiters ds-101.cr.cnaf.inf ...
I am an Oracle DBA, new to Informix. I am learning Informix as I go, so please bear with me. I created a new database server called "onqadb" (IDS 9.3 on AIX 4.3). I used ISM to create the database server. When I clicked on ON-BAR on the left nav bar, I only see the error. I see the file on the ...
I have been running some I/O tests on my storage system and I have seen something curious: I can write faster than I can read from my filesystems. We are running the latest update of GPFS, 3.2.1-25. All the filesystems are on RAID5 or RAID6 (which should introduce a penalty on the write side, not on the read side). I hav ...
I have an issue with the IBM JRE completely crashing on several machines. The following are a few sections of a core dump. Any ideas on what might be the root cause of the issue or where else to look? Thanks, Ryan NULL 0SECTION TITLE subcomponent dump routine NULL =============================== 1TISIG ...
I've just downloaded and installed the latest SDK fix pack for WAS 5.1; now the java -version result is: java version "1.4.2" Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2) Classic VM (build 1.4.2, J2RE 1.4.2 IBM Windows 32 build cn142sr1a-20050209 (JIT enabled: jitc)) After starting WAS and in ...
Hi, when I try to run the WAS 5.1 server in the RAD 6 debugger it fails with a Java core dump, but the application starts up when run in normal mode. Following is a snippet of the information present in the Java dump file, and I have attached the complete file. Can someone please help resolve this issue? Thanks, Ram NUL ...
I am using Rational Robot to record a client installation (.exe file). The installation is done through InstallShield. Record & Replay works when I select "Low-level Recording". I am UNABLE to record when I use the following options: 1) Low-level Recording is OFF 2) Add Verification Points ...
Hello, so I switched my 'data only' drives to JFS. I had to shut my computer down by holding the power button (it froze when running a screensaver). Now both drives give this message in both GParted & PYSDM: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper ...
When I insert a 44 MB SQL dump file into a table using the NDB Cluster engine, the ndb fs size is much bigger than the SQL dump file. [root@test bin]# ls -alh cq.sql -rw-r--r-- 1 root root 44M 24 Feb 22:00 cq.sql [root@test bin]# du -sh /root/mysqldata/ndb_2_fs/ 20M /root/mysqldata/ndb_2_fs/ [root@test bin]# ./mys ...
I am also seeing a similar problem. The code is here: #include<stdio.h> #include<string.h> int palindrome(char *a,int n); int main(){ char a[20]; int b,n; printf("Enter the string to be checked\n"); scanf("%s",a); printf("The entered string is "); puts(a); printf( ...