It seems all too often that when users are unable to access an end-user business function protected by a IDAM (Identity and Access Management) solution, the IDAM system gets the brunt of the blame and in a lot of cases without justification. Today’s corporate web based business functions are comprised of complex systems based on a service oriented applications. As such, it can be difficult to diagnose particular issues in a timely manner to preclude having to restart several components. As the issue persists, security controls may be removed or bypassed all together resulting in another set of problems. In many cases the root cause does not get identified and a repeat incident occurs.
Example Use Case
Consider a system that hosts a web application providing an end-user business function to allow users to sign up for service and be able to pay their bills online. To protect the web application, an Oracle IDAM system, referred to as the SSO Stack, is implemented to provide access control and data protection for the end-users. As you can see, there are a lot of complicated flows and dependencies in the systems.
Suppose an issue has been reported by an end-user and technical support personnel are logged in to try and resolve the issue. To illustrate the complexity of the issue, suppose an end-user cannot access the system to pay their bill. Without having an in-depth knowledge of what is going on inside the systems, it is difficult to determine if the web application is the problem or if the problem is related to the SSO Stack. If it is the SSO Stack, which component is at fault?
Remember the movie, the matrix, “take the red pill” and find out what is really going on in the matrix. “Take the blue pill” and you live in ignorance and bliss. When troubleshooting systems, the tendency is to: collect and analyze logs on each of the system components independently, trouble-shoot at the network level, and execute manual user tests, all time consuming. How many times have you heard someone say “I can ping the server just fine” yet the problem persists.
“What if I told you”, testing at the application layer provides a more accurate indication of what is really going on inside the system. The business functionality is either working as intended or it is not. Applications performing the business functions can be modeled as services and tested in real-time. Service tests can measure the end-user’s ability to access a service and if automated, allow issues to be resolved before end-user complaints start rolling in. Service tests strategically placed in each critical subsystem can be used as health checks determining which system component may be at fault if there are reported issues.
EM12c Cloud Control Service Model
With EM12c Cloud control, business functions can be modeled as services to be monitored for availability and performance. Systems can be defined based on target components hosting the service. As a service is defined, it is associated with a system and one or more service tests. Service tests emulate the way a client would access the service and can be set up using out-of-the-box test frameworks: web testing automation, SQL timing, LDAP, SOAP, Ping tools, etc. and can be extended through Jython based scripting support. The availability of a service can be determined by the results of service tests or the system performance metrics. The results of the system metrics can be utilized in system usage metrics and in conjunction with service level agreements (SLAs). Additionally, aggregate services can be modeled to consist of sub-services with the availability of the aggregate service dependent on the availability of each of the individual sub-services.
Example Use Case Revisited with EM12c Service Model
Revisiting the issue reported in the previous use-case, it was not a trivial task in determining whether it was or was not an SSO issue and which component or components were at fault. Now consider modeling the consumer service and running web automation end-user service tests against the web application. Consider the SSO stack as a service modeling the Identity and Access Management functionality. The SSO Stack can be defined as an aggregate service with the following subservices: SSO Service, STS Service, Directory Service and Database Service. The availability and performance of the SSO Stack can be measured based on the availability and performance of each of the subservices within the SSO Stack chain. Going back to the problem reported in fig 1, the end-user could not access the web application to pay their bill. Suppose service tests are set up to run at the various endpoints as illustrated in figure 3. As expected, the end-user service tests are showing failures. If the service tests for the Directory Service and Database are passing, it can be concluded the problem is within the OAM server component. Looking further into the results of the SSO Service and STS Service the problematic application within the OAM server can be determined. As this illustration points out, Service tests provide a more systematic way of trouble shooting and can lead you to a speedier resolution and root cause.
Em12c Cloud Control Features
The following are some of the features available with the EM12 Cloud Control monitoring solution to provide the capabilities as mentioned not available from the basic Enterprise Manager Fusion Middleware Control.
- Service Management:
- Service Definition: Defining a service as it relates to a business function. Modeling services from end-to-end with aggregate services.
- Service tests: Web traffic, SOAP, Restful, LDAP, SQL, ping etc. to determine end-user service and system level availabilities and performance.
- System monitoring. Monitoring a group of targets that represent a system that is intended to provide a specific business function.
- Service level agreements (SLAs) with monitoring and reporting for optimization.
- Performance monitoring
- Defining thresholds for status, performance and alerts
- Out-of-the-box and custom available metrics
- Real-time and historical metric reporting with target comparison
- Dashboard views that can be personalized.
- Service level agreement monitoring
- Incident reporting based on availability and performance threshold crossing, escalation and tracking from open to closure. Can also be used to track SLAs.
- System and service topology modeling tool for viewing dependencies. Can help with performance and service level optimization and root cause analysis.
- Oracle database availability and performance monitoring:
- Throughput transaction metrics on reads, write and commits
- DB wait time analysis
- View top SQL and their CPU consumption by SQL ID.
- DBA task assistance:
- Active Data Guard and standby Management
- RMAN backup scheduling
- Log and audit monitoring
- Multi-Domain management: Production, Test, Development with RBAC rules. All domains from one console.
- Automated discovery of Identity Management and fusion middleware Components
- Plug-ins from 3rd party and developer tools with Jython scripting support to extend service tests, metrics etc.
- Log pattern matching that can be used as a customizable alerting mechanism and performance tool.
- Track and compare configurations for diagnostics purposes.
- Automated patch deployment and management.
- Integration of the system with My Oracle Support
As a final note and why it is referred to as EM12 Cloud Control
One of the advanced uses of Oracle Enterprise Manager 12c is being able to manage multiple phases of the cloud lifecycle—such as the planning, set up, build, deployment, monitoring, metering/chargeback, and optimization of the cloud. With its comprehensive management capabilities for clouds, Oracle Enterprise Manager 12c enables rapid deployment and end-to-end monitoring of infrastructure as a service (IaaS), platform as a service (PaaS)—including database as a service (DBaaS), schema as a service (Schema-aaS), and middleware as a service (MWaaS).