&N Dream up the future lab.

Envision the future
with Nomura Research Institute

Mark Hines, Managing Executive Officer, MS&AD Systems (center-right)
Takeshi Nagaura, General Manager, Quality Control Department, MS&AD Systems (right)
Tomoaki Kimura, Systems Consulting Division, NRI (center-left)
Kayoko Tsubouchi, Systems Consulting Division, NRI (left)


When it comes to managing emergencies such as systems failures, “incident commanders” are playing an increasingly vital role in taking charge on-site and achieving rapid recoveries. MS&AD Systems, which is single-handedly responsible for operating and managing the information systems of the MS&AD Group’s insurance companies, has partnered with NRI to establish guidelines and train the Group’s personnel in an effort to achieve greater resilience. We spoke with key persons at both companies who are associated with this project about its aims and the progress made thus far.

Pursuing “resilience” to make financial infrastructure more reliable

――What kind of concerns or motivation led to this collaboration?

Hines: Nowadays, those in the financial sector can’t carry out their daily operations without IT systems. Because insurance products are intended to provide economic support to clients during emergencies, having a system go down at a critical moment for a client is the kind of situation that we would most like to avoid. So, our thinking was that rather than just preventing such issues from arising, we ought to create a mechanism for rapidly restoring operations if the worst should actually happen, and for minimizing the impact on the client.


Kimura: While conventional thinking emphasizes preventing issues from arising, another approach focuses on addressing systems failures or other such problems that do arise, looking to minimize user impact and achieve a quick recovery. This approach is called “resilience”. Improving resilience requires not only technical efforts, but also a comprehensive approach that covers organizational frameworks, establishing processes, developing human resources, and so forth. In particular, failure response requires “incident commanders” who can take charge on-site and decide on an overall direction and then pursue it, in order to accomplish a rapid recovery.

Nagaura: Large-scale systems failures don’t occur every day, which is why, until now, a limited number of workers have managed situations on-site and handled the response. However, now that systems are becoming more complex and the risks are increasing, we believe we need to operate on the assumption that systems failures will occur despite every effort to prevent them. Rather than relying on specific people to handle them, we need to transform incident response into a standardized skill and increase the number of personnel who can exercise leadership.

Turning “individual knowledge” into an “organizational asset”

――What approach did you follow in undertaking this project?

Nagaura: First, we defined the term “incident commander” and created a job description for it. That is, what sort of setup and steps are involved, and what needs to be kept in mind when carrying out the tasks? We documented these details to turn individuals’ tacit knowledge into explicit, formal knowledge, in order to make it available to other members and equip them to respond.

Tsubouchi: At large organizations, it’s important to have these definitions in writing. In addition to defining and laying out the role of an incident commander, we created a set of failure response guidelines summarizing the key topics, such as desirable response structures for handling systems failures, response flows, and the like. What we kept in mind was the need to entrench these guidelines within MS&AD Systems, while leveraging NRI’s knowledge and insights in the process. We also focused on thoroughly understanding the background details of their business and their on-site practices, and on getting direct feedback from the people working there.

Nagaura: To make sure that the core of this newly defined knowledge takes root on-site, we asked all of our divisions to recommend people at the general manager level, as well as candidates for next-generation incident commanders, and we conducted two training sessions, each for 20 such people.

――What sort of training did you conduct?

Kimura: With this project, we put the emphasis on developing incident commanders in particular, trying to systematically teach the practice of failure response, which until now had been a mass of tacit knowledge. Through classroom lectures as well as case studies and workshops, we found ways of getting the participants to engage in deeper dialogues with each other, and of giving them a feel for how the material applies in practice.

[Photo] Through workshops and examining case studies, the members of different divisions engaged in lively discussions and shared their perceptions

Nagaura: While discussing the case studies, the participants were able to share some of the creative solutions from their various divisions as well as common challenges. I believe the participants realized as well that they had an opportunity to learn advanced methods from a leading expert in the field like Mr. Kimura. Their responses to the post-training survey were even more positive than we’d imagined.

Kimura: The fact that knowledge could be shared across divisions, for example, is one of the many positive comments we were happy to receive. I think this was a useful exercise for cultivating a mindset and a culture of collaborating and supporting each other in times of need.

Ongoing persistence and improvements to ensure swift restoration of operations

――Please share any final comments you may have on the project or any future prospects.

Tsubouchi: This project was made possible through a combination of strong awareness at the management level and a motivation to improve among those on the ground level. With the business environment and technology undergoing significant changes, the importance of incident commanders will only continue to grow. However, developing human resources and establishing frameworks won’t happen overnight. I think that’s precisely why it’s essential to have a medium- and long-term plan, one that goes beyond one-off measures, and that also includes personnel allocation.

Nagaura: No matter how superb your training program is, if you only conduct it once, it will ultimately be forgotten. The challenge going forward will be to craft a framework that makes it possible to get routine information updates, and to maintain and improve the skills and know-how acquired through practice. This current effort is akin to sowing the seeds, and even if you’re sowing the same seeds all around, they’ll all grow differently from one division to another. From a quality control perspective as well, I’d like to pursue standardization in a way that eliminates the variations among divisions and gets the whole Group aligned in the same direction.

Kimura: Improving system operation quality is something of a universal theme, but the surrounding environment and the methods are evolving. The AI-driven “development” we’re seeing recently is dramatically expanding the scale of these systems. That said, if you rush into operations without a prepared plan, there’s a risk that the task will exceed your management capabilities and lead to major problems. While failure response also calls for the use of AI, the final judgment has to be made by a human being. In that sense as well, I think that the development of incident commanders, which has been the focus of this project, will also be the key to success when it comes to utilizing AI.

Hines: With DX (digital transformation) becoming firmly established, IT systems are becoming an indispensable part of operations in every business, which means that the risks are also getting more complex. Beyond simply reducing the number of problems, what matters most when an incident occurs is to minimize the impact on the client and on operations, and to achieve a rapid recovery. Sorting out who should be doing what in order to restore operations quickly will be a huge asset for companies. My hope is for us to use what we’ve created here as a foundation and keep on refining it going forward so that our organization can demonstrate a high level of resilience when emergencies do arise.

Profile

  • Tomoaki Kimura

    IT Architecture Consulting Department

    Joined NRI in 2002.
    He gained extensive experience with failure response through involvement in developing, maintaining, and operating financial business systems. Subsequently, he was engaged in technical and service development for enhancing system operations. He currently specializes in IT service management, working on improving internal and external system operations, as well as serving as a training instructor in the area of improving failure responses. He is an NRI-certified IT service manager.

  • Kayoko Tsubouchi

    IT Architecture Consulting Department

    Joined NRI in 2009.
    She was involved in developing, maintaining, and operating business systems for insurance companies, and is currently engaged in consulting focused on improving and advancing system operations, conducting support activities for enhancing system resilience.

* Organization names and job titles may differ from those currently in use.