/*
 * Memo
 *   g++ -O2 -S -m64 -march=sandybridge
 *
 * What was I stuck on?
 *   When compiled in my environment, the generated code appears to have
 *   assumed that array A was aligned. In the other environment, however,
 *   the array was not aligned, so the program hit an alignment fault and
 *   was judged RE (runtime error).
 */
#include <cstdio>
#include <cstring>
#include <stdint.h>
#include <utility>
#include <vector>
#ifdef MY_LOCAL_RUN
#include <immintrin.h>
#endif

#define rep(i,n) for(int (i)=0;(i)<(int)(n);++(i))
#define rer(i,l,u) for(int (i)=(int)(l);(i)<=(int)(u);++(i))
#define reu(i,l,u) for(int (i)=(int)(l);(i)<(int)(u);++(i))
#if defined(_MSC_VER) || __cplusplus > 199711L
#define aut(r,v) auto r = (v)
#else
#define aut(r,v) __typeof(v) r = (v)
#endif
#define each(it,o) for(aut(it, (o).begin()); it != (o).end(); ++ it)
#define all(o) (o).begin(), (o).end()
#define pb(x) push_back(x)
#define mp(x,y) make_pair((x),(y))
#define mset(m,v) memset(m,v,sizeof(m))
#define INF 0x3f3f3f3f
#define INFL 0x3f3f3f3f3f3f3f3fLL
using namespace std;
typedef vector<int> vi;
typedef pair<int,int> pii;
typedef vector<pair<int,int> > vpii;
typedef long long ll;
template<typename T, typename U> inline void amin(T &x, U y) { if(y < x) x = y; }
template<typename T, typename U> inline void amax(T &x, U y) { if(x < y) x = y; }

// Arguments and result of sum_trunc_mul are passed through globals so that
// the hand-pasted assembly below can reach them by symbol name.
const double *g_A;   // input array; the vector loop expects 32-byte alignment
int g_N;             // number of elements in g_A
double g_s_inv;      // multiplier (1 / candidate value)
long long g_res;     // output: sum over i of trunc(g_A[i] * g_s_inv)

#ifdef MY_LOCAL_RUN
// Reference version with AVX intrinsics, built only for local runs.
extern "C" void sum_trunc_mul();
__declspec(noinline) void sum_trunc_mul() {
	const double *A = g_A;
	int N = g_N;
	double s_inv = g_s_inv;
	__m256d inv = _mm256_set1_pd(s_inv);
	int i = 0;
	__m256d sum = _mm256_setzero_pd();
	// Four doubles per iteration. _mm256_load_pd is the aligned load, so
	// A must be 32-byte aligned here.
	for(; i + 3 < N; i += 4) {
		__m256d a_d = _mm256_load_pd(A + i);
		__m256d prod = _mm256_mul_pd(a_d, inv);
		__m256d truncated = _mm256_round_pd(prod, (_MM_FROUND_TO_ZERO |_MM_FROUND_NO_EXC));
		sum = _mm256_add_pd(sum, truncated);
	}
	// Horizontal sum of the four lanes.
	double sum4[4];
	_mm256_storeu_pd(sum4, sum);
	long long res = 0;
	rep(k, 4)
		res += (ll)sum4[k];
	// Scalar tail for the remaining N % 4 elements.
	for(; i < N; ++ i)
		res += (ll)(A[i] * s_inv);
	g_res = res;
}
#else
// Submitted version: compiler output for the function above (see the memo
// for the flags), pasted in as assembly.
extern "C" void sum_trunc_mul();
__asm(
".text\n"
"sum_trunc_mul:\n"
".L_FB4544:\n"
" subq $40, %rsp\n"
" movl g_N(%rip), %r9d\n"
" vmovsd g_s_inv(%rip), %xmm2\n"
" movq g_A(%rip), %r8\n"
" vmovddup %xmm2, %xmm3\n"
" vinsertf128 $1, %xmm3, %ymm3, %ymm3\n"
" cmpl $3, %r9d\n"
" jle .L_10\n"
" leal -4(%r9), %ecx\n"
" xorl %eax, %eax\n"
" vxorpd %xmm1, %xmm1, %xmm1\n"
" shrl $2, %ecx\n"
" movl %ecx, %edx\n"
" addq $1, %rdx\n"
" salq $5, %rdx\n"
" .p2align 4,,10\n"
".L_6:\n"
" vmulpd (%r8,%rax), %ymm3, %ymm0\n"
" addq $32, %rax\n"
" vroundpd $11, %ymm0, %ymm0\n"
" vaddpd %ymm0, %ymm1, %ymm1\n"
" cmpq %rdx, %rax\n"
" jne .L_6\n"
" leal 4(,%rcx,4), %r11d\n"
".L_5:\n"
" vmovupd %ymm1, (%rsp)\n"
" xorl %eax, %eax\n"
" xorl %ecx, %ecx\n"
" movq %rsp, %r10\n"
".L_7:\n"
" vcvttsd2siq (%r10,%rax), %rdx\n"
" addq $8, %rax\n"
" addq %rdx, %rcx\n"
" cmpq $32, %rax\n"
" jne .L_7\n"
" cmpl %r11d, %r9d\n"
" jle .L_8\n"
" subl $1, %r9d\n"
" movslq %r11d, %rdx\n"
" leaq (%r8,%rdx,8), %rax\n"
" subl %r11d, %r9d\n"
" leaq 1(%rdx,%r9), %rdx\n"
" leaq (%r8,%rdx,8), %r8\n"
" .p2align 4,,10\n"
".L_9:\n"
" vmulsd (%rax), %xmm2, %xmm0\n"
" addq $8, %rax\n"
" vcvttsd2siq %xmm0, %rdx\n"
" addq %rdx, %rcx\n"
" cmpq %r8, %rax\n"
" jne .L_9\n"
".L_8:\n"
" movq %rcx, g_res(%rip)\n"
" vzeroupper\n"
" addq $40, %rsp\n"
" ret\n"
".L_10:\n"
" vxorpd %xmm1, %xmm1, %xmm1\n"
" xorl %r11d, %r11d\n"
" jmp .L_5\n"
"\n");
#endif

int main() {
	int N;
	{
		scanf("%d", &N);
		// Over-allocate by 32 bytes, then round the pointer up to the next
		// 32-byte boundary so the vector loop always sees an aligned array.
		char *A_d_buf = new char[N * 8 + 32];
		double *A_d = reinterpret_cast<double*>((reinterpret_cast<uintptr_t>(A_d_buf) | 31) + 1);
		rep(i, N) {
			int A;
			scanf("%d", &A);
//			A = 5;
			A_d[i] = A;
		}
		long long K;
		scanf("%lld", &K);
//		K = 8;
		// Binary search for the largest mid such that
		// sum over i of trunc(A[i] / mid) is at least K.
		const double EPS = 1e-9;
		double l = 0, u = 1e9;
		while(l + EPS < u && l * (1 + EPS) < u) {
			double mid = (l + u) / 2;
			double inv = 1. / mid;
			g_A = &A_d[0], g_N = N, g_s_inv = inv;
			sum_trunc_mul();
			long long sum = g_res;
			if(sum >= K)
				l = mid;
			else
				u = mid;
		}
		printf("%.10f\n", (l + u) / 2);
		delete[] A_d_buf;
	}
	return 0;
}
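
/*
Note on the alignment workaround (a sketch, not part of the submitted solution).

The memo at the top attributes the runtime error to generated code that
assumed array A was aligned. main() handles this by allocating 32 spare bytes
and rounding the pointer up with (addr | 31) + 1. The snippet below isolates
that arithmetic so it can be checked on its own. The helper names align_up_32,
is_aligned_32 and aligned_buffer_fits are hypothetical and do not appear in
the original; the 32-byte figure is the width of one 256-bit AVX register.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: first 32-byte boundary strictly after p.
// Same arithmetic as main(): set the low five bits, then add one.
inline double *align_up_32(char *p) {
    assert(p != nullptr);
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<double *>((addr | 31) + 1);
}

// Hypothetical helper: true when p sits on a 32-byte boundary.
inline bool is_aligned_32(const void *p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 31) == 0;
}

// Hypothetical self-check: with 32 bytes of slack, n doubles placed at the
// rounded-up pointer are aligned and still end inside the allocation.
inline bool aligned_buffer_fits(std::size_t n) {
    std::vector<char> buf(n * sizeof(double) + 32);
    double *a = align_up_32(buf.data());
    char *end = reinterpret_cast<char *>(a + n);
    return is_aligned_32(a) && end <= buf.data() + buf.size();
}
```

The pointer moves forward by between 1 and 32 bytes, which is why exactly 32
bytes of slack are enough. An already aligned pointer is still advanced by a
full 32 bytes, so the slack is never optional with this formula.
*/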